NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358, or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition,
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, a compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which they have a
signature region (as set forth in Table 3); or for which they have homology to a gene family (as
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and
polynucleotides of the present invention are useful for a variety of applications, as described
herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
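
The base-pairing rule in the preceding definition can be illustrated with a short sketch. The function names below (`complement`, `reverse_complement`, `paired_fraction`) are illustrative assumptions and not part of the specification; only the four DNA bases are handled.

```python
# Illustrative sketch only: Watson-Crick pairing for DNA strings.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(seq_5to3: str) -> str:
    """Return the pairing strand, written 3'->5' (position by position)."""
    return "".join(PAIRS[base] for base in seq_5to3.upper())

def reverse_complement(seq_5to3: str) -> str:
    """Return the pairing strand, written in the usual 5'->3' direction."""
    return complement(seq_5to3)[::-1]

def paired_fraction(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of aligned positions that base-pair.

    1.0 corresponds to "complete" complementarity; any value strictly
    between 0 and 1 corresponds to "partial" complementarity.
    """
    if len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must have equal length")
    paired = sum(PAIRS[a] == b
                 for a, b in zip(strand_5to3.upper(), other_3to5.upper()))
    return paired / len(strand_5to3)

print(complement("AGT"))              # pairing strand of 5'-AGT-3', read 3'->5'
print(paired_fraction("AGT", "TCA"))  # complete complementarity
print(paired_fraction("AGT", "TCC"))  # partial complementarity
```

Here `complement("AGT")` reproduces the 3'-TCA-5' example given in the definition.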

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages, particularly
from the yolk sac, mesenteries, or gonadal ridges, during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs comprises nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
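
The letter conventions in the preceding definition (T read as U for RNA, and N standing for any of A, C, G or T) can be sketched as follows. The helper names `to_rna` and `expand_n` are hypothetical and are used here only for illustration.

```python
from itertools import product

def to_rna(dna: str) -> str:
    """Substitute U (uracil) for every T (thymine), as contemplated for RNA."""
    return dna.upper().replace("T", "U")

def expand_n(seq: str) -> list[str]:
    """List every concrete sequence obtained by reading each N as A, C, G or T.

    Note: the result grows as 4**(number of N), so this is only practical
    for sequences with a handful of N positions.
    """
    choices = ["ACGT" if base == "N" else base for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]

print(to_rna("ATGN"))
print(expand_n("AN"))
```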

The terms "oligonucleotide fragment" or a "polynucleotide fragment," "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York, NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is 1 in 300. In the human
genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there are 300 times more twenty-mers than there are base pairs in a set of
human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in arrays
for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because expressed
sequences comprise less than approximately 5% of the entire genome sequence.
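
The arithmetic behind these estimates can be reproduced with a minimal sketch. It assumes, as the text does, a genome of three billion base pairs and treats the expressed fraction as exactly 5% (the text states only "less than approximately 5%"); the function name `kmers_per_position` is illustrative.

```python
GENOME_BP = 3_000_000_000      # three billion base pairs, as stated in the text
EXPRESSED_FRACTION = 0.05      # assumption: upper bound given in the text

def kmers_per_position(k: int, target_bp: float) -> float:
    """Ratio of possible k-mers (4**k) to positions in the target sequence.

    A ratio of R corresponds to a chance of roughly "1 in R" that a given
    k-mer is fully matched somewhere in the target.
    """
    return 4 ** k / target_bp

for k, target, label in [
    (20, GENOME_BP, "twenty-mer, whole genome"),
    (17, GENOME_BP, "seventeen-mer, whole genome"),
    (15, GENOME_BP * EXPRESSED_FRACTION, "fifteen-mer, expressed sequences"),
]:
    print(f"{label}: about 1 in {kmers_per_position(k, target):.1f}")
```

Under these assumptions the computed ratios are of the same order as the rounded figures quoted in the text (about 1 in 300 and about 1 in 5); the exact values depend on the genome size and expressed fraction assumed.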

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
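
The single-mismatch calculation described above, a full-match probability of 1 ÷ 4^k multiplied by 3 x k possible single-base variants, can be sketched as follows. The genome size and the 5% expressed fraction are the same assumptions used in the text; `single_mismatch_expectation` is a hypothetical name.

```python
GENOME_BP = 3_000_000_000      # three billion base pairs, as stated in the text
EXPRESSED_FRACTION = 0.05      # assumption: upper bound given in the text

def single_mismatch_expectation(k: int, target_bp: float) -> float:
    """Expected number of sites in the target matching a k-mer with one mismatch.

    p_full   : probability of a full match at one position, 1 / 4**k
    variants : number of single-mismatch variants, 3 alternatives x k positions
    """
    p_full = 1 / 4 ** k
    variants = 3 * k
    return target_bp * p_full * variants

print(single_mismatch_expectation(25, GENOME_BP))
print(single_mismatch_expectation(20, GENOME_BP))
print(single_mismatch_expectation(18, GENOME_BP * EXPRESSED_FRACTION))
```

With these inputs the twenty-mer and eighteen-mer cases come out at roughly one in six and one in eight respectively, the same order of magnitude as the "approximately one in five" stated in the text.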
The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
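
As an illustration of this definition, the following sketch scans one reading frame and returns the longest run of triplets that contains no termination codon. It assumes the standard stop codons TAA, TAG and TGA and does not require a start codon; the function name `longest_stop_free_run` is an illustrative assumption.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (DNA letters)

def longest_stop_free_run(seq: str, frame: int = 0) -> str:
    """Return the longest series of consecutive triplets without a stop codon."""
    seq = seq.upper()
    best, current = "", []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            # A stop codon closes the current run; keep it if it is the longest.
            if 3 * len(current) > len(best):
                best = "".join(current)
            current = []
        else:
            current.append(codon)
    if 3 * len(current) > len(best):
        best = "".join(current)
    return best

example = "ATGGCCTAAGGGCCCAAATTT"
print(longest_stop_free_run(example, frame=0))
print(longest_stop_free_run(example, frame=1))
```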

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
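
The "redundancy" referred to above can be illustrated with a short sketch that translates codons using the standard genetic code and checks whether a codon substitution is silent. The compact table layout (bases ordered T, C, A, G) and the names `translate` and `is_silent` are choices made for this illustration only; `*` denotes a stop codon, as in the Sequence Listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Translate a DNA coding sequence triplet by triplet (frame 0)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def is_silent(codon_a: str, codon_b: str) -> bool:
    """True if replacing codon_a with codon_b leaves the amino acid unchanged."""
    return CODON_TABLE[codon_a.upper()] == CODON_TABLE[codon_b.upper()]

print(translate("ATGGCT"))
print(is_silent("GCT", "GCC"))  # both codons encode alanine
print(is_silent("GAT", "GAA"))  # aspartic acid versus glutamic acid
```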

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
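
The four residue groups listed above can be encoded directly, which gives a simple test for whether a substitution is "conservative" in the sense of this paragraph. This is a minimal sketch using one-letter amino acid codes; treating "same group" as the sole criterion is a simplifying assumption, and the name `is_conservative` is illustrative.

```python
# Residue groups exactly as listed in the text, in one-letter codes.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}

def group_of(residue: str) -> str:
    """Return the name of the group containing the residue."""
    for name, members in GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")

def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same group (simplifying assumption)."""
    return group_of(original) == group_of(replacement)

print(is_conservative("L", "I"))  # leucine -> isoleucine
print(is_conservative("K", "D"))  # lysine -> aspartic acid
```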

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).
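For illustration, the exemplary oligonucleotide wash temperatures listed above can be held in a small lookup table. This is a sketch under stated assumptions: the function name is hypothetical, and the choice to reject lengths that are not listed (rather than interpolate between them) is an assumption of this example, since the text gives only the four values.

```python
# Exemplary stringent wash temperatures in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as listed in the text.
WASH_TEMPERATURE_C = {14: 37, 17: 48, 20: 55, 23: 60}

def wash_temperature_c(oligo_length):
    """Return the listed wash temperature (in degrees C) for an oligonucleotide.

    Hypothetical helper: lengths not listed in the text are rejected rather
    than interpolated, because the text provides no rule for them.
    """
    if oligo_length not in WASH_TEMPERATURE_C:
        raise ValueError(f"no listed wash temperature for {oligo_length}-base oligos")
    return WASH_TEMPERATURE_C[oligo_length]

temperature_for_17mer = wash_temperature_c(17)
```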

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression
characteristics are considered substantially equivalent. For the purposes of determining
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can
also be determined by other methods known in the art, e.g., by varying hybridization conditions.
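The arithmetic in the definition above (differences divided by the number of residues in the substantially equivalent sequence) can be sketched as follows. This is an illustrative sketch, not the Jotun Hein method: it assumes the two sequences have already been aligned by some other tool, that gaps are written as "-", and it does not address the functional part of the definition. The function names and example sequences are hypothetical.

```python
def fraction_varied(reference_aligned, subject_aligned):
    """Differences (substitutions, additions, deletions) per subject residue.

    Both arguments are rows of an existing pairwise alignment, of equal
    length, with '-' marking gaps (an assumption of this sketch).
    """
    if len(reference_aligned) != len(subject_aligned):
        raise ValueError("aligned sequences must have the same length")
    differences = sum(
        1 for ref, sub in zip(reference_aligned, subject_aligned) if ref != sub
    )
    subject_residues = sum(1 for sub in subject_aligned if sub != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return differences / subject_residues

def within_variation_limit(reference_aligned, subject_aligned, max_fraction=0.35):
    """True if the subject varies from the reference by no more than max_fraction."""
    return fraction_varied(reference_aligned, subject_aligned) <= max_fraction

# One substitution in a 10-residue subject sequence: 1 / 10 = 0.1 varied.
one_substitution = fraction_varied("ACGTACGTAC", "ACGTTCGTAC")
# One deletion leaves 9 subject residues: 1 / 9 varied.
one_deletion = fraction_varied("ACGTACGTAC", "ACG-ACGTAC")
```

Passing a smaller `max_fraction` (for example 0.05) corresponds to the narrower embodiments recited above.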

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing. 
The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO: 1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO: 1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to: a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1-
1786 and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO: 1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding
region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also include polynucleotides having nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO: 1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
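The codon substitution described above can be illustrated with the standard genetic code: two coding sequences that differ at the nucleotide level can translate to the same amino acid sequence. The sketch below is for illustration only. The two example ORFs are invented, not sequences from the Sequence Listing, and the helper name is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(coding_sequence):
    """Translate a DNA coding sequence (length a multiple of 3) codon by codon."""
    sequence = coding_sequence.upper()
    if len(sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(
        CODON_TABLE[sequence[i:i + 3]] for i in range(0, len(sequence), 3)
    )

# Invented example ORF and a variant with three synonymous codon substitutions
# (GCT->GCC for Ala, TTA->CTG for Leu, AAA->AAG for Lys).
original_orf = "ATGGCTTTAAAA"
synonymous_orf = "ATGGCCCTGAAG"
same_protein = translate(original_orf) == translate(synonymous_orf)
```

Both example sequences translate to the same four-residue peptide (Met-Ala-Leu-Lys) although they differ at three nucleotide positions.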

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells.
Also included are the cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-1786 and 3573-
5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of
suitable vectors and promoters are known to those of skill in the art and are commercially
available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection, of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO: 1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO: 1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO: 1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of an mRNA, but more preferably is an
oligonucleotide that is antisense to only a portion of the coding or noncoding region of an mRNA.
For example, the antisense oligonucleotide can be complementary to the region surrounding the
translation start site of an mRNA. An antisense oligonucleotide can be, for example, about 5, 10,
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can
be chemically synthesized using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase the
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used.
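As an illustration of designing an antisense oligonucleotide by Watson-Crick pairing, the sketch below takes the reverse complement of a window of the coding strand around the translation start site. It is a minimal sketch under stated assumptions: the function names, the window size, and the example sequence are hypothetical, only unmodified DNA bases are handled, and Hoogsteen pairing and modified nucleotides are outside its scope.

```python
# Watson-Crick complements for unmodified DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]

def antisense_oligo(coding_strand, start_codon_index, flank=10):
    """Antisense DNA oligo covering the start codon plus `flank` bases each side.

    `start_codon_index` is the 0-based position of the first base of the start
    codon on the coding strand; the window is clipped at the sequence ends.
    """
    window_start = max(0, start_codon_index - flank)
    window_end = min(len(coding_strand), start_codon_index + 3 + flank)
    return reverse_complement(coding_strand[window_start:window_end])

# Invented coding-strand fragment; the ATG start codon begins at index 6.
example_coding_strand = "GGCACCATGGCTTTAAAAGG"
oligo = antisense_oligo(example_coding_strand, 6, flank=4)
```

With a flank of 4, the targeted window is the 11-base stretch CACCATGGCTT, and the resulting oligonucleotide is its reverse complement. Longer flanks give oligonucleotides in the 15 to 50 nucleotide range mentioned above.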

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-
2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-
1786 and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1787-3572 and 5359-
7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO: 1-
1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO: 1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO: 1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO: 1787-3572 and 5359-7144.
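As a purely illustrative aid, the amino acid identity thresholds recited above can be checked
against a pair of already-aligned sequences along the following lines. The sketch assumes that
percent identity is counted over aligned columns in which neither sequence has a gap; other
conventions for the denominator exist, and alignment programs may report slightly different
values. The function names are hypothetical.

```python
# Thresholds taken from the "substantial equivalents" passage above.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def highest_threshold_met(identity: float):
    """Return the largest recited threshold that `identity` reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None
```

Two ten-residue peptides differing at a single position, e.g. MKTAYIAKQR and MKTAYLAKQR, give
90% identity under this convention and therefore meet the "at least about 90%" level but not
the 95% level.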

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
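The notion of a degenerate variant can be illustrated with a short Python sketch that translates
two ORFs using the standard genetic code and compares the resulting polypeptides. The example
ORFs in the usage note are invented for illustration and are not taken from the sequence
listing. The helper names are hypothetical, and the sketch assumes in-frame sequences made up
only of A, C, G and T (or U).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf: str) -> str:
    """Translate an in-frame ORF (DNA or RNA letters) into one-letter amino acids."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in nucleotide sequence but encode the same polypeptide."""
    a = orf_a.upper().replace("U", "T")
    b = orf_b.upper().replace("U", "T")
    return a != b and translate(a) == translate(b)
```

For example, ATGGCTTTAAAA and ATGGCCCTGAAG both translate to MALK, so each is a degenerate
variant of the other, whereas ATGGCTTTTAAA translates to MAFK and is not.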

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO: 1787-3572 and 5359-7144.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method, which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
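The systematic substitution step of the alanine-scanning method can be sketched as follows.
This is a minimal illustration that only enumerates the variant sequences to be made and
tested; the activity assay itself is experimental and is not modeled. The function name, the
`window` parameter (1 for single residues, larger for strings of residues) and the decision to
skip windows that are already all alanine are assumptions of this sketch.

```python
def alanine_scan(protein: str, window: int = 1):
    """Yield (1-based start position, variant sequence) for each alanine substitution.

    Each variant replaces `window` consecutive residues with alanine. Windows that
    already consist only of alanine are skipped because the variant would be identical
    to the parent sequence.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    protein = protein.upper()
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue
        yield i + 1, protein[:i] + "A" * window + protein[i + window:]
```

For the four-residue peptide MKAC, `list(alanine_scan("MKAC"))` yields the variants AKAC
(position 1), MAAC (position 2) and MKAA (position 4); position 3 is skipped because it is
already alanine.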

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLab (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP- HPLC) 
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 

30 aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified in
computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software (Nevill-
Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
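By way of illustration only, the idea of choosing the alignment that gives the largest match and then reporting identity over that alignment can be sketched as follows. This is a minimal sketch, not the GAP or BLAST implementation: it assumes a plain Needleman-Wunsch global alignment with a linear gap penalty, and the function name and the scoring values (match, mismatch, gap) are hypothetical choices made for the example.

```python
def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over one optimal global alignment of a and b.

    Assumption: simple Needleman-Wunsch scoring with a linear gap penalty;
    the programs named in the text use more elaborate scoring schemes.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path, counting identical columns and total columns.
    i, j = n, m
    identical = 0
    columns = 0
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += int(a[i - 1] == b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return 100.0 * identical / columns if columns else 0.0
```

With these example scores, aligning "ACGT" against "AGT" places a single gap opposite the C, so three of the four alignment columns are identical. Different scoring parameters can select a different optimal alignment and therefore a different identity value, which is why the program and parameters used should be stated alongside any identity figure.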

4.7 CHIMERIC AND FUSION PROTEINS

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention comprising one or more domains are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
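The in-frame requirement described above can be illustrated with a short sketch. It is offered only as an example under stated assumptions: the sequences are short made-up coding sequences rather than any sequence of the invention, the function names are hypothetical, and the standard genetic code is assumed. The sketch joins an N-terminal partner to a C-terminal partner, drops the stop codon of the N-terminal partner, and confirms that the junction preserves the reading frame.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(n_partner: str, c_partner: str, linker: str = "") -> str:
    """Return a fusion coding sequence with both partners in one reading frame.

    Hypothetical helper: every part must be a whole number of codons, and the
    stop codon of the N-terminal partner (if present) is removed so that
    translation reads through into the C-terminal partner.
    """
    parts = [p.upper() for p in (n_partner, linker, c_partner)]
    for part in parts:
        if len(part) % 3 != 0:
            raise ValueError("part length is not a multiple of 3; frame would shift")
    n_seq, link_seq, c_seq = parts
    if n_seq[-3:] in STOP_CODONS:
        n_seq = n_seq[:-3]
    fused = n_seq + link_seq + c_seq
    if "*" in translate(fused)[:-1]:
        raise ValueError("internal stop codon in fusion")
    return fused
```

For instance, fusing the made-up sequences "ATGTCCCCTTAA" and "GGATCCAAATGA" yields a coding sequence that translates to "MSPGSK*", whereas a partner whose length is not a multiple of three is rejected because it would shift the downstream frame.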

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.
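The antisense approach mentioned above rests on base-pair complementarity: an antisense molecule is the reverse complement of the stretch of transcript it is meant to bind. The following is a minimal sketch of that relationship only; the function name, the default oligomer length and the example transcript are hypothetical, and practical antisense design involves further considerations (target accessibility, chemistry, specificity) that are not modeled here.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(transcript: str, start: int, length: int = 20) -> str:
    """Return a DNA oligomer complementary to transcript[start:start+length].

    The transcript may be written in the RNA (ACGU) or DNA (ACGT) alphabet.
    The result is written 5' to 3', i.e. it is the reverse complement of
    the targeted window.
    """
    target = transcript.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the transcript")
    return target.translate(DNA_COMPLEMENT)[::-1]
```

For example, targeting the first six bases of the made-up transcript "AUGGCCAAA" gives the oligomer "GGCCAT", which pairs base for base with "AUGGCC" when the two strands are aligned antiparallel.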

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.
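The selection logic described in the preceding paragraph can be summarized as a small decision rule. The sketch below is only an illustration of that reasoning: the class and function names are hypothetical, and it assumes an idealized outcome in which a correct homologous recombination event retains the contiguous selectable marker and loses the flanking negatively selectable marker, while a random integration retains both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Marker status of a cell clone after exposure to the targeting DNA."""
    has_selectable_marker: bool   # marker contiguous with the targeting DNA
    has_negative_marker: bool     # flanking marker, e.g. HSV TK or gpt


def survives_double_selection(clone: Clone) -> bool:
    """True if the clone survives selection for the contiguous marker and
    selection against the negatively selectable marker."""
    return clone.has_selectable_marker and not clone.has_negative_marker


def classify(clone: Clone) -> str:
    """Interpret marker status under the idealized assumptions above."""
    if not clone.has_selectable_marker:
        return "no stable integration"
    if clone.has_negative_marker:
        return "likely random integration"
    return "candidate homologous recombinant"
```

Under these assumptions only clones classified as candidate homologous recombinants survive both selections; in practice such candidates would still be confirmed by a direct analysis of the targeted locus.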

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of the polypeptides as well as for studying modulators of
the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.
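One of the research uses listed above, deriving PCR primers from a known polynucleotide sequence, can be sketched in a few lines. This is a simplified illustration rather than a primer design procedure: the function names and the default primer length are hypothetical, primers are simply taken from the two ends of the template, and the melting temperature is estimated with the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is only a rough guide for short oligomers.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(DNA_COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def end_primers(template: str, length: int = 18) -> tuple:
    """Forward and reverse primers taken from the two ends of the template.

    The forward primer copies the first `length` bases of the given strand;
    the reverse primer is the reverse complement of the last `length` bases,
    so that it anneals to the given strand and extends back toward the start.
    """
    t = template.upper()
    if len(t) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    return t[:length], reverse_complement(t[-length:])
```

A real design would also screen candidate primers for secondary structure, primer-dimer formation and uniqueness within the genome of interest, none of which is attempted here.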

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6 - Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11 - Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9 - Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.



4.10.4 STEM CELL GROWTH FACTOR ACTIVITY

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); King et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g. as described by
Bernstein et al. Blood, 77: 2316-2321 (1991).



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 
Suitable assays for proliferation and differentiation of various hematopoietic lines are 
cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed, has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed, has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring, which may allow normal
tissue to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above. 
Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY

Polypeptides of the invention may be involved in cancer cell generation, proliferation or
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.,
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g., from American Type Tissue Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.



4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
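
By way of non-limiting illustration, the diminution in complex formation measured in a competitive binding assay can be expressed as a percent inhibition relative to a control containing no test agent. The following is a minimal sketch only; the function name, the background-subtraction step and the units of the signal are assumptions made for illustration and are not prescribed by this description.

```python
def percent_inhibition(signal_with_agent, signal_control, background=0.0):
    """Percent reduction in complex formation caused by a test agent.

    signal_control: binding signal with no test agent (hypothetical units).
    signal_with_agent: binding signal in the presence of the test agent.
    background: non-specific signal subtracted from both readings (assumed).
    """
    specific_control = signal_control - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_agent = signal_with_agent - background
    return 100.0 * (1.0 - specific_agent / specific_control)
```

An agent that leaves the signal unchanged gives a value near zero, and an agent that reduces the specific signal to background gives a value near 100.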

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 252:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol,
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
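
By way of non-limiting illustration, the comparison of two otherwise identical cell populations described above can be sketched as a simple fold-difference filter. The function name, the dictionary layout, the response units and the two-fold threshold are all hypothetical choices made for this sketch; the description does not specify how the responses are to be compared.

```python
def candidate_ligands(response_expressing, response_control, min_fold=2.0):
    """Return ligands whose response is at least min_fold higher in the
    receptor-expressing population than in the matched control population.

    Both arguments map a ligand name to a measured response (hypothetical
    units); min_fold is an assumed, illustrative threshold.
    """
    hits = []
    for ligand, expressing in response_expressing.items():
        control = response_control.get(ligand)
        if control is None or control <= 0:
            continue  # no usable control measurement for this ligand
        if expressing / control >= min_fold:
            hits.append(ligand)
    return sorted(hits)
```

Ligands lacking a matched control measurement are skipped rather than scored, since the comparison is meaningful only when both populations were tested.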

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such as for
acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to
intrauterine infections.



4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified

nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
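
By way of non-limiting illustration, the restriction fragment length polymorphism analysis mentioned above can be sketched in silico: a polymorphism that creates or destroys a recognition site changes the set of fragment lengths produced by digestion. The sketch below assumes a linear, single-stranded representation of the sequence and uses the EcoRI recognition site (GAATTC, cut after the first G) purely as an example enzyme; the function names and the short test sequences are hypothetical and are not sequences of the invention.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of site; cut_offset is the cut position within the site."""
    cuts = []
    start = 0
    while True:
        index = sequence.find(site, start)
        if index == -1:
            break
        cuts.append(index + cut_offset)
        start = index + 1  # continue searching just past this match
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def differs_by_rflp(allele_a, allele_b, site="GAATTC", cut_offset=1):
    """True if the two alleles give different fragment-length patterns."""
    pattern_a = sorted(digest(allele_a, site, cut_offset))
    pattern_b = sorted(digest(allele_b, site, cut_offset))
    return pattern_a != pattern_b
```

Comparing sorted fragment lengths mirrors what is read from a gel, where fragments are distinguished by size rather than by their order in the original molecule.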

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.



4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
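
By way of non-limiting illustration, the dosing and scoring timetable described above can be laid out programmatically. The sketch below assumes that the day of adjuvant injection is counted as day 0 and that "every other day" means days 0, 2, 4 and so on through day 24; both are interpretations made for illustration, and the function and event names are hypothetical.

```python
SCORING_DAYS = (14, 15, 18, 20, 22, 24)  # scoring days listed in the text


def study_schedule(last_day=24):
    """Return (day, events) pairs for the adjuvant arthritis protocol.

    Assumes day 0 is the injection day and treatment falls on even days.
    """
    treatment_days = set(range(0, last_day + 1, 2))
    schedule = []
    for day in range(last_day + 1):
        events = []
        if day == 0:
            events.append("induce")  # killed M. tuberculosis in CFA
        if day in treatment_days:
            events.append("treat")  # test compound, or PBS for controls
        if day in SCORING_DAYS:
            events.append("score")  # overall arthritis score
        if events:
            schedule.append((day, events))
    return schedule
```

Under these assumptions day 15 is the only scoring day that does not coincide with a treatment day.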


4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art. 
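
The per-kilogram ranges stated above can be converted into absolute per-dose amounts for a given body weight. The following is a minimal sketch of that arithmetic only; the function and constant names are hypothetical and are not part of the disclosure, and the actual dosage remains a matter for the prescribing physician.

```python
# Illustrative arithmetic only. All names here are hypothetical.
UG_PER_MG = 1000.0

# Ranges quoted in the text, expressed in micrograms (ug) per kg of body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)      # about 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)    # about 0.1 ug/kg to 10 mg/kg


def dose_range_ug(body_weight_kg, range_ug_per_kg):
    """Return the (low, high) absolute dose in micrograms for one patient."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


def is_within_range(dose_ug, body_weight_kg, range_ug_per_kg):
    """Check whether an absolute dose falls inside a per-kg range."""
    low, high = dose_range_ug(body_weight_kg, range_ug_per_kg)
    return low <= dose_ug <= high
```

For a 70 kg patient, for example, the preferred range works out to roughly 7 µg to 700 mg per dose.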

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient,
optionally grinding a resulting mixture, and processing the mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic
administration. Naturally, the proportions of a co-solvent system may be varied considerably
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed.
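
The VPD and VPD:5W compositions described above reduce to simple concentration arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that volumes are additive on mixing, which the text does not state.

```python
# Illustrative arithmetic only. All names here are hypothetical.
# Component concentrations are in % w/v, i.e. grams per 100 mL of solution.
VPD_STOCK = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_5W = {"dextrose": 5.0}  # 5% dextrose in water


def mix(solution_a, solution_b, parts_a=1.0, parts_b=1.0):
    """Combine two solutions by volume parts, assuming additive volumes."""
    total = parts_a + parts_b
    names = sorted(set(solution_a) | set(solution_b))
    return {
        name: (solution_a.get(name, 0.0) * parts_a
               + solution_b.get(name, 0.0) * parts_b) / total
        for name in names
    }


def grams_needed(solution, volume_ml):
    """Mass of each component (g) required to prepare volume_ml of solution."""
    return {name: pct * volume_ml / 100.0 for name, pct in solution.items()}


# VPD diluted 1:1 with 5% dextrose in water.
VPD_5W = mix(VPD_STOCK, DEXTROSE_5W)
```

Under the additive-volume assumption, the 1:1 dilution halves every stock concentration, so VPD:5W would contain 1.5% w/v benzyl alcohol, 4% w/v polysorbate 80, 32.5% w/v polyethylene glycol 300 and 2.5% w/v dextrose.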

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other
active ingredient of the invention, which may also optionally be included in the composition as
described above, may alternatively or additionally be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 

compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).
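
The sequestering-agent amounts stated above are weight percentages of the total formulation, so a candidate formulation can be checked with a few lines of arithmetic. This is a minimal sketch with hypothetical function names; the range labels simply restate the 0.5-20 wt % and 1-10 wt % figures from the text.

```python
# Illustrative arithmetic only. All names here are hypothetical.
USEFUL_WT_PERCENT = (0.5, 20.0)     # range stated as useful
PREFERRED_WT_PERCENT = (1.0, 10.0)  # range stated as preferred


def weight_percent(agent_mass_g, other_components_mass_g):
    """Sequestering agent as wt % of the total formulation weight."""
    total = agent_mass_g + other_components_mass_g
    if agent_mass_g < 0 or total <= 0:
        raise ValueError("masses must be non-negative with a positive total")
    return 100.0 * agent_mass_g / total


def classify(wt_percent):
    """Label a wt % value against the ranges given in the text."""
    if PREFERRED_WT_PERCENT[0] <= wt_percent <= PREFERRED_WT_PERCENT[1]:
        return "preferred"
    if USEFUL_WT_PERCENT[0] <= wt_percent <= USEFUL_WT_PERCENT[1]:
        return "useful"
    return "outside stated range"
```

For example, 5 g of carboxymethylcellulose in a formulation weighing 100 g in total is 5 wt %, which falls in the preferred range.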

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF 1 (insulin like growth factor 1), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that can be used to more accurately determine useful doses in
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity).
Such information can be used to more accurately determine useful doses in humans. 
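
The IC50 referred to above is the concentration giving half-maximal inhibition. As a minimal sketch, it can be estimated from cell-culture data by interpolating, on a logarithmic concentration scale, between the two measurements that bracket 50% inhibition. The function name and the interpolation method are assumptions made for illustration; in practice a sigmoidal curve fit is commonly used instead, and the text does not prescribe either approach.

```python
import math


def estimate_ic50(concentrations, percent_inhibition):
    """Estimate IC50 by log-linear interpolation (hypothetical helper).

    Assumes positive concentrations and inhibition expressed as a
    percentage of the maximal effect. Returns the concentration at
    which inhibition crosses 50%.
    """
    if len(concentrations) != len(percent_inhibition):
        raise ValueError("inputs must have the same length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    points = sorted(zip(concentrations, percent_inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        # Find the first adjacent pair of measurements that brackets 50%.
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("data do not bracket 50% inhibition")
```

The resulting value is only as good as the spacing of the measured concentrations around the 50% point.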

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 

35 cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
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population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD 50 and KD50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
5 range of dosage for use in human. The dosage of such compounds lies preferably within a range 
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
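
The therapeutic index described above is a simple ratio, and a minimal sketch of the arithmetic may help. The function name and the example doses below are hypothetical and chosen only for illustration; they are not values from this specification.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50 / ED50.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
ti = therapeutic_index(ld50=500.0, ed50=25.0)
print(f"therapeutic index = {ti:.1f}")
```

A larger ratio means the lethal dose lies further above the effective dose, which is why compounds with high therapeutic indices are preferred.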

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 30-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
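
As an illustration of how the fraction of a dosing interval spent above the MEC might be estimated, the following sketch assumes a one-compartment model with repeated intravenous bolus doses at steady state and first-order elimination. That model, the function name, and all parameter values are assumptions made for this example; the specification does not prescribe a pharmacokinetic model.

```python
import math


def fraction_time_above_mec(c0, half_life_h, interval_h, mec):
    """Fraction of each dosing interval with plasma level above the MEC.

    Assumes (hypothetically) a one-compartment model, IV bolus dosing at
    steady state, and first-order elimination.
    c0          -- concentration increment produced by one dose
    half_life_h -- elimination half-life in hours
    interval_h  -- dosing interval in hours
    mec         -- minimal effective concentration, same units as c0
    """
    if min(c0, half_life_h, interval_h, mec) <= 0:
        raise ValueError("all arguments must be positive")
    k = math.log(2) / half_life_h                    # elimination rate constant
    peak = c0 / (1.0 - math.exp(-k * interval_h))    # steady-state peak level
    if peak <= mec:
        return 0.0
    t_above = math.log(peak / mec) / k               # hours until level falls to MEC
    return min(1.0, t_above / interval_h)


# Hypothetical regimen: check whether it falls in the 50-90% window.
frac = fraction_time_above_mec(c0=7.5, half_life_h=6.0, interval_h=12.0, mec=5.0)
print(round(frac, 3), 0.5 <= frac <= 0.9)
```

In practice the measured plasma concentrations (e.g., from HPLC assays or bioassays) would replace the model curve; the sketch only shows how the percentage criterion can be evaluated.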

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 
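
Because the exemplary range mixes microgram and milligram units, a short unit-conversion sketch may be useful. It converts the broad range of about 0.01 µg/kg to 100 mg/kg into absolute daily amounts for a given body weight. The function name and the 70 kg example weight are hypothetical, and the result is plain arithmetic, not a dosing recommendation.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def daily_dose_range_mg(weight_kg, low_ug_per_kg=0.01, high_mg_per_kg=100.0):
    """Return (low, high) daily amounts in mg for a subject of weight_kg.

    The defaults mirror the broad exemplary range stated in the text;
    the lower bound is given in ug/kg and converted to mg.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low_mg = low_ug_per_kg * weight_kg / UG_PER_MG
    high_mg = high_mg_per_kg * weight_kg
    return low_mg, high_mg


# Hypothetical 70 kg adult.
low, high = daily_dose_range_mg(70.0)
print(f"{low:.4f} mg to {high:.0f} mg per day")
```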

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a 
portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
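
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average over the Kyte Doolittle hydropathy values, without Fourier transformation. The window length, the function names, and the example peptide are assumptions made for illustration; the peptide is not a sequence of the invention.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of every window of the given length."""
    seq = sequence.upper()
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD_SCALE[res] for res in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(sequence, window=9):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical peptide; window of 6 matches the minimum fragment length above.
start, peptide, score = most_hydrophilic_window("MLLAVLLGKDEKRSTIVLLA", window=6)
print(start, peptide, round(score, 2))
```

Windows with strongly negative mean scores are the hydrophilic stretches that the text suggests are more likely to lie on the protein surface; a low score alone does not establish surface exposure or antigenicity.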

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

5.13.1 Polyclonal Antibodies

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated. 
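
For orientation, the classical linear Scatchard transformation can be sketched as follows. Under a single-site binding model, bound/free plotted against bound gives a straight line with slope -1/Kd and x-intercept Bmax. This is a textbook least-squares simplification offered as an assumption for illustration, not the fitting procedure of the cited reference; the function name and the data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit B/F = (Bmax - B) / Kd by ordinary least squares.

    bound, free -- equal-length sequences of bound and free ligand
                   concentrations (same units).
    Returns (Kd, Bmax) under a single-site binding assumption.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired observations")
    x = [float(b) for b in bound]
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope            # slope = -1/Kd
    bmax = -intercept / slope    # x-intercept of the fitted line
    return kd, bmax


# Hypothetical, noise-free data generated with Kd = 2 and Bmax = 10.
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
kd, bmax = scatchard_fit(bound, free)
print(round(kd, 3), round(bmax, 3))
```

A smaller Kd corresponds to higher binding affinity, which is the property preferred in the text. With real, noisy data the linear transformation distorts the error structure, so nonlinear fitting is generally more reliable.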

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody. 



5.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).



5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983, Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.



5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
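
The count of ten molecules follows from simple combinatorics, which the sketch below enumerates. It assumes, purely for illustration, that each antibody is an unordered pair of heavy-light half-molecules, that any heavy chain can pair with any light chain, and that the labels H1/L1 and H2/L2 denote the two cognate parental pairs; these labels and the function name are hypothetical.

```python
from itertools import combinations_with_replacement, product


def quadroma_species(heavy=("H1", "H2"), light=("L1", "L2")):
    """Enumerate the distinct antibodies a quadroma could assemble.

    Each half-molecule is one heavy chain paired with one light chain;
    an antibody is an unordered pair of half-molecules.
    """
    arms = [h + l for h, l in product(heavy, light)]   # 4 possible half-molecules
    return list(combinations_with_replacement(arms, 2))


species = quadroma_species()
# Only the pairing of both cognate arms gives the desired bispecific molecule.
correct = [s for s in species if s == ("H1L1", "H2L2")]
print(len(species), len(correct))
```

Four possible half-molecules give six mixed pairs plus four identical pairs, i.e., ten species, of which one carries both correctly paired binding sites. This is why the text notes that purification by affinity chromatography is needed.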

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g., CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).



5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCL),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.



4.14 COMPUTER READABLE SEQUENCES

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable mediums can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.
A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
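
The two storage formats named above, an ASCII text file and a database application, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the FASTA-style file layout, the function names, the table name `sequences`, and the use of SQLite from the Python standard library as a stand-in for a database application such as DB2, Sybase or Oracle. The record name and sequence in any usage are placeholders, not sequences of the invention.

```python
import sqlite3


def write_fasta(path, records, width=60):
    """Write {name: sequence} pairs to a plain ASCII, FASTA-style text file."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            # Wrap the sequence so each line holds at most `width` residues.
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path):
    """Read the file written by write_fasta back into a {name: sequence} dict."""
    records, name = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif name is not None:
                records[name] += line
    return records


def store_in_database(conn, records):
    """Store the same records in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", records.items()
    )
    conn.commit()


def fetch_from_database(conn, name):
    """Return the stored sequence for `name`, or None if it is absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

The flat file suits sequential reading by search programs, while the table suits lookup by identifier, which is one way to read the statement that the choice of storage structure follows from the means chosen to access the information.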

By providing any of the nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
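
To make the notion of an open reading frame concrete, the following is a minimal sketch of a naive ORF scan. It is not the BLAST or BLAZE procedure referred to above; it simply walks the three forward reading frames and reports stretches that begin with ATG and end at the first in-frame stop codon. The function name, the `min_codons` threshold and the example sequence in the usage note are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (frame, start, end, orf) tuples for ATG...stop runs.

    Only the three forward frames are scanned; `end` is exclusive and
    includes the stop codon. `min_codons` counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3, seq[start:i + 3]))
                start = None  # resume looking for the next start codon
    return orfs
```

For example, `find_orfs("CCATGGCTTAAGG")` reports a single ORF, `ATGGCTTAA`, in frame 2. The opposite strand can be covered by running the same function on the reverse complement of the input.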

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means are available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (N POLYPEPTIDE! A). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
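
The remark that longer target sequences are less likely to occur at random can be made quantitative with a short sketch. It assumes a simplified model in which residues are independent and equally frequent, so a target of length L matches a given position with probability 1/4^L for nucleotides (1/20^L for amino acids). The function names and the toy stored sequence are assumptions for illustration, and the exact-match search shown is far simpler than the alignment programs named above.

```python
def expected_random_matches(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches under a uniform random model."""
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


def find_target(target, stored):
    """Return (name, offset) for every exact occurrence of `target`.

    `stored` maps sequence names to sequences; overlapping hits are reported.
    """
    hits = []
    for name, seq in stored.items():
        pos = seq.find(target)
        while pos != -1:
            hits.append((name, pos))
            pos = seq.find(target, pos + 1)
    return hits
```

Under this model a six-nucleotide target is expected to appear by chance about once in every 4096 positions, whereas a 30-nucleotide target is expected far less than once even in a database of billions of positions.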

As used herein, "a target structural motif" or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
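
As one illustration of searching for a nucleic acid target motif, the sketch below looks for candidate hairpins, taken here to mean a short stem whose two arms are reverse complements separated by a loop. This is a sequence-level proxy only and says nothing about whether the structure actually folds; the stem length, loop range and function names are assumptions chosen for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (start, loop_length, left_arm, right_arm) for candidate hairpins.

    A candidate is a `stem`-base arm followed, after a loop of
    min_loop..max_loop bases, by the arm's reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = seq[j:j + stem]
            if len(right) == stem and right == reverse_complement(left):
                hits.append((i, loop, left, right))
    return hits
```

For instance, `find_hairpins("TTGCACAAAAGTGCTT")` reports one candidate whose arms `GCAC` and `GTGC` flank a four-base loop.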

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
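
The dependence of antisense design on sequence information can be shown with a minimal sketch: an antisense DNA oligonucleotide is the reverse complement of the chosen mRNA region, here constrained to the 20 to 40 base length stated above. The function name and the example mRNA in the usage note are hypothetical, and the sketch ignores practical design criteria such as target accessibility and secondary structure.

```python
# Map each mRNA base to the DNA base that pairs with it.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo complementary to mrna[start:start + length].

    The oligo is written 5' to 3', i.e. as the reverse complement of the
    target region. Lengths outside 20..40 bases are rejected.
    """
    if not 20 <= length <= 40:
        raise ValueError("oligo length should be 20 to 40 bases")
    region = mrna.upper()[start:start + length]
    if len(region) < length:
        raise ValueError("target region runs past the end of the mRNA")
    return region.translate(RNA_TO_DNA_COMPLEMENT)[::-1]
```

With the made-up transcript `AUGGCUUCUAGCGAUCGAUCGGAUCC`, `antisense_oligo(mrna, 0, 20)` yields `GATCGATCGCTAGAAGCCAT`, which pairs base for base with the first twenty nucleotides of the transcript.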

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a
polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers, which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.

4.17 MEDICAL IMAGING



The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.



4.18 SCREENING ASSAYS 
Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:1-1786
and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternately, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides (for example, see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix formation
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.
Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO:1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
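
A degenerate pool of possible sequences can be enumerated mechanically from a probe written with IUPAC ambiguity codes, as in the following minimal sketch. The function name and the example probes in the usage note are hypothetical; the code table is the standard IUPAC nucleotide alphabet.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every discrete sequence encoded by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

For example, `expand_degenerate("ATR")` gives `["ATA", "ATG"]`, and a probe such as `GGNCAY` expands to a pool of 4 x 2 = 8 discrete sequences. Pool size grows multiplicatively with each ambiguous position.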
Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).



The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash: first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
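
The volumes and concentrations in the linkage protocol above can be checked with simple dilution arithmetic (C1 x V1 = C2 x V2). The sketch below is illustrative only: the helper names are hypothetical, volumes are assumed to be additive on mixing, and the per-well DNA mass assumes that the 7.5 ng/ul figure applies to the solution as dispensed, which the text does not state explicitly.

```python
def stock_volume_needed(stock_conc, final_conc, final_volume):
    """Volume of stock required so that final_volume has final_conc."""
    return final_conc * final_volume / stock_conc


def concentration_after_mixing(conc, volume, added_volume):
    """Concentration of a solute after its volume is mixed with added_volume."""
    return conc * volume / (volume + added_volume)


# 0.1 M 1-methylimidazole stock diluted to 10 mM in a 75 ul well volume.
meim_stock_ul = stock_volume_needed(0.1, 0.010, 75.0)

# 25 ul of 0.2 M EDC added to 75 ul already in the well.
edc_final_molar = concentration_after_mixing(0.2, 25.0, 75.0)

# Upper bound on DNA per well if 75 ul is dispensed at 7.5 ng/ul (assumption).
dna_ng_per_well = 7.5 * 75.0
```

On these assumptions, about 7.5 ul of the 0.1 M stock is needed per 75 ul, the EDC ends up at roughly 50 mM in a 100 ul reaction, and each well receives at most about 560 ng of DNA.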

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.
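
The combinatorial logic behind such an array can be sketched in a few lines. The sketch assumes, for illustration, that the 256 probes correspond to every combination of four bases at four variable positions (4^4 = 256), and that each coupling step uses one mask exposing exactly those cells that should receive a given base at a given position. The function names are hypothetical and no chemistry is modeled.

```python
from itertools import product

BASES = "ACGT"


def synthesis_steps(length):
    """List the (position, base) coupling steps: one masked step per base."""
    return [(pos, base) for pos in range(length) for base in BASES]


def run_synthesis(length):
    """Simulate masked synthesis; return {intended probe: synthesized probe}."""
    # Each array cell is addressed by the probe it is meant to carry.
    cells = {"".join(p): "" for p in product(BASES, repeat=length)}
    for pos, base in synthesis_steps(length):
        for target in cells:
            # The mask for this step exposes only cells needing `base` here.
            if target[pos] == base:
                cells[target] += base
    return cells
```

Running `run_synthesis(4)` fills all 256 cells with their intended probes using only 4 x 4 = 16 coupling steps, which illustrates why the number of probes grows exponentially with probe length while the number of steps grows only linearly.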

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6 (incorporated herein by reference). In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2686 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
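
The site specificities described above may be illustrated with a short sketch. The following Python fragment is an illustration only and is not part of the cited method; the function names and the example sequence are hypothetical. It assumes that Pu denotes A or G and Py denotes C or T, that the normal activity recognizes PuGCPy, and that the relaxed (CviJI**) activity additionally recognizes PyGCPy and PuGCPu, with the cut falling between the G and the C in every case.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def is_site(n1, n4, relaxed):
    """Decide whether N1-G-C-N4 is cleaved, given the flanking bases."""
    normal = n1 in PURINES and n4 in PYRIMIDINES          # PuGCPy
    if not relaxed:
        return normal
    py_py = n1 in PYRIMIDINES and n4 in PYRIMIDINES       # PyGCPy
    pu_pu = n1 in PURINES and n4 in PURINES               # PuGCPu
    return normal or py_py or pu_pu


def cut_positions(seq, relaxed=False):
    """Return 0-based indices of the base that follows each blunt cut."""
    seq = seq.upper()
    cuts = []
    for i in range(len(seq) - 3):
        n1, g, c, n4 = seq[i:i + 4]
        if g == "G" and c == "C" and is_site(n1, n4, relaxed):
            cuts.append(i + 2)  # the cut lies between G (i+1) and C (i+2)
    return cuts


def fragments(seq, relaxed=False):
    """Split the sequence at every cut position."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


example = "TTAGCTAACGCCAATGCA"   # hypothetical test sequence
print(cut_positions(example))                 # normal specificity
print(fragments(example, relaxed=True))       # relaxed specificity
```

In this toy sequence the PyGCPu site (TGCA) is left uncut under both settings, which is the one combination the text does not list for CviJI**.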

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
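
The 64-patient example may be checked with a small layout calculation. The sketch below is illustrative only. It assumes a 9 mm pin pitch (the usual well spacing of a 96-well plate), a 1 mm dot pitch, and an 8 x 8 arrangement of the 64 offset prints within each subarray; none of these values other than the 1 mm dot span is stated in the text, and the function names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12      # 96-pin device
PIN_PITCH_MM = 9.0              # assumed: standard 96-well spacing
DOT_PITCH_MM = 1.0              # one dot per 1 mm step
GRID = 8                        # assumed: 8 x 8 = 64 dots per subarray


def dot_position(patient, pin_row, pin_col):
    """Return (x, y) in mm of one patient's dot within one pin's subarray."""
    if not 0 <= patient < GRID * GRID:
        raise ValueError("patient index must be in 0..63")
    off_row, off_col = divmod(patient, GRID)   # offset-printing step
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y


def membrane_extent():
    """Return (width, height) in mm covered by all 96 subarrays."""
    x, y = dot_position(GRID * GRID - 1, PIN_ROWS - 1, PIN_COLS - 1)
    return x + DOT_PITCH_MM, y + DOT_PITCH_MM


def subarray_gap():
    """Space left between neighbouring subarrays, in mm."""
    return PIN_PITCH_MM - GRID * DOT_PITCH_MM


print(membrane_extent(), subarray_gap())
```

Under these assumptions the 96 subarrays occupy about 107 x 71 mm, which fits on an 8 x 12 cm membrane, and the gap between subarrays is 1 mm, in agreement with the figures given above.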

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
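
The clustering of clones by probe signature may be pictured with the following sketch. It is an illustration only and does not reproduce the procedure actually used. It assumes that a signature is the set of probes found in a clone (exact substring matching stands in for hybridization), that similarity is measured by the Jaccard index, and that a clone joins the first cluster whose representative it matches at or above a threshold; the threshold, the function names and the toy sequences are all hypothetical.

```python
def signature(clone_seq, probes):
    """Set of probes that match the clone (stand-in for hybridization)."""
    return frozenset(p for p in probes if p in clone_seq)


def jaccard(a, b):
    """Jaccard similarity of two probe sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(signatures, threshold=0.8):
    """Greedy clustering; the first member of each cluster is its representative."""
    clusters = []
    for name, sig in signatures.items():
        for members in clusters:
            if jaccard(sig, signatures[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


probes = ["ACGTACG", "TTTTGGG", "GGGCCCA"]          # toy 7-mer probes
clones = {
    "c1": "AAACGTACGTTTTGGGA",
    "c2": "CCACGTACGTTTTGGGC",
    "c3": "GGGCCCATT",
}
sigs = {name: signature(seq, probes) for name, seq in clones.items()}
print(cluster(sigs))
```

One representative per cluster (here the first member) would then be carried forward for sequencing.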

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single-pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2

Assemblage of Novel Nucleic Acids

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
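
The seed-extension loop described above may be sketched as follows. This is a minimal illustration under stated assumptions, not the program actually used. The callable `find_hits` is hypothetical and stands in for a BLASTN search of the databases, returning (sequence identifier, score, percent identity) tuples. For simplicity each newly added member is used as the next query, whereas the text compares candidates against the extending assemblage as a whole.

```python
def extend_assemblage(seed_id, find_hits, min_score=300, min_identity=95.0):
    """Grow a set of sequence ids from a seed until no new hit qualifies."""
    members = {seed_id}
    frontier = [seed_id]
    while frontier:                      # stops when nothing extends the set
        query = frontier.pop()
        for hit_id, score, identity in find_hits(query):
            if hit_id in members:
                continue
            # Thresholds from the text: score > 300 and identity > 95%.
            if score > min_score and identity > min_identity:
                members.add(hit_id)
                frontier.append(hit_id)
    return members


# Toy hit table standing in for database searches (hypothetical data).
toy_hits = {
    "seed": [("a", 450, 98.0), ("b", 250, 99.0)],
    "a": [("c", 320, 96.5), ("seed", 450, 98.0)],
    "c": [("d", 500, 94.0)],
}
print(sorted(extend_assemblage("seed", lambda q: toy_hits.get(q, []))))
```

In the toy data, sequence "b" fails the score threshold and "d" fails the identity threshold, so neither joins the assemblage.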

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice
variants resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1-327.
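
Candidate positions for the hand editing mentioned above may be located by translating a sequence in each forward frame and listing premature stop codons. The sketch below is illustrative only and is not one of the named programs. It assumes the standard genetic code and treats any stop codon before the final codon of a frame as a candidate for review; the function names and the test sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate one forward frame; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame, len(seq) - 2, 3)
    )


def internal_stops(seq, frame=0):
    """Codon indices of stop codons that occur before the last codon."""
    protein = translate(seq, frame)
    return [i for i, aa in enumerate(protein[:-1]) if aa == "*"]


cdna = "ATGGCCTAAGGATGA"          # hypothetical sequence with an early stop
for frame in range(3):
    print(frame, translate(cdna, frame), internal_stops(cdna, frame))
```

A frame that reads cleanly only after a one-base shift partway through the sequence would suggest a frame shift, which is the kind of feature the text describes correcting by hand.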

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences which the nucleic acid sequences encode are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.3.2 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 328-1413.

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413.

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences which the nucleic acid sequences encode are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.



5.3.2 EXAMPLE 5

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences which the nucleic acid sequences encode are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.

WO 01/53312 PCT/US00/34263

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.

The homologues for SEQ ID NO: 1653-1745 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5.2 EXAMPLE 7 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.

The homologues for SEQ ID NO: 1746-1768 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites",
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.6.2 EXAMPLE 8

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences which the nucleic acid
sequences encode are shown in the Sequence Listing. The full-length nucleotide sequences,
including splice variants resulting from these procedures, are shown in the Sequence Listing as
SEQ ID NOS: 1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786.

The homologues for SEQ ID NO: 1769-1786 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.

TABLE 1

Tissue Origin    RNA Source    Hyseq Library Name    SEQ ID NOS:

adult brain      GIBCO         AB3001
S 19-21 50-51 65-66 72 78 so e:
65 67 1C7-308 113 116 123 I3t
140 150-352 159 169 177 192-193
702-203 212-214 225 226 235-236
25^ 258 2^8-269 272 280-281 25E>
298 303 321 326 331 332 334 350-
357 362 369 375 382 383 416 423
443 459-460 473 475 477 4 88 4 96
500 503 529 526 5«7 574 582 5f
606-609 633 638 633 634 645-646
652 657-658 660 669-671 678 667
695 697 710 735 724 731 775-777
796 804 811 857-859 862 869 899-
900 912 519 922 924-929 93 3 Si 6
962 979 988-985 996 1001 1004 -
1008 1016 1039 1047 1059 106<
1067 1070 1078 1082 1107 1113
3136-1117 -131 1134-1137 1140
1149 1151 1157 1180 1206 122$
1234 1241 1243 1258 1272-1273
1279 128b-1290 1294 13C7-1306
1312 1320 -323 1330 1356 1360-
1361 1368 1373-1375 1379 1391
1400 1417 1446 1468 1482 1493
1494 1S01-1503 1506-1507 1532
1537 1522-3524 153 0 1533 1537
1549 1565 1578 1598 1606 160C
3623 1625 1627 1639 1643 1646
1649 1653 1664 1667 1671 1696
1734 1741 1743-3744 1760-1761
177:

adult brain      GIBCO         ABD003
3" 12-14 16-39 25 30-31 34-36 43- 
45 50-51 56 58 60 65-66 68-69 60 
82 65 87 92 104 107-108 112-113 
115-116 123-124 131-132 135-13' 
139 142 146 148-349 152 154 157 
155 163 165 167 169 172 180 192- 
193 196-197 199 203 208 210 212- 
214 223 233 235-237 247 257 259 
261 268-269 272 276 280-281 284- 
286 291-292 295 297 300-301 304 
307 317 320-321 323 327 329-331 
333-334 345-349 356-357 379-38: 
393 401 408 414 419 424 426-426 
430 433-436 438-439 443 445 445 
453-454 459-461 468 471-473 476- 
478 483 491 494 496 500 503 507- 
508 516 519-520 525-5Z7 534 536- 
540 542-543 545 553 555 560 565- 
570 574-576 586-588 593 595 597 
601 606-609 616-620 622-623 625 
628-633 635-636 643 645-649 653 
655-656 660-665 668-670 676 681 
687 701 710 715 717 724-728 735 
743 745-746 750 753 759 765-766 
773 775-778 786 789 796 799-800 
802-803 810-811 8lS 817 820-821 
832 834-836 840 845-847 851 858- 
861 864 869 874 878 883 897 901- 
902 904-905 908 911-914 916 921- 
922 924-927 929 932-934 936-939 
941-942 945 955-958 963 966-969 
977 979-980 985-986 990 992-99? 
997-1001 1005-1007 1012 1017- 
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
1078 1081-1082 1085-10B6 1089
1097 1 103 lie - ; 1109 1112 "mc- 113 7 1119
112j 112 4 1127 : : 30 1134 I 144- 13 4^ 1349
us: -157- 11S8 1 167 117C 1178 1184 1188
1190 1193- 1 1 9 <j 1200 1202 : 2 1 1 - 1217 1 220
122f - 1227 1229 1231 1241 1243 1247 1252
i2be :?63 1267 1269 127i : 1281 1284 1286-
1289 1293- 1294 1306 -1307 J312 1316- 1320
13 2( 1333 1338 1341 1344 134 8 13 CO 1355-1357
1368 1374 1377 1380 1386 1389- 1390 1394
1400 1405 1414 1422- 1423 1425- 14 27 1437
1443 1446 1454 1456 1458- 1455 1468 1470-
1472 1478 1482- 1483 1487 -1486 1493 1497
7 4 59 150f 1508 -1511 1527 1527- 1524 1531
1533 1545- 1S46 1548- 1550 1552 15S7 -1559-
1563 1565 1567 1S6S 1571 1586 1S88 1591
1593 1S9£ 1598 -1601 16 oe 1611 1620- -1621
1624 -1626 1628 1630- 1632 163C 1640 -1641
1644- 1645 1647 1649 1653 -1655 16 57 1664
1667 1669 1673 1678- 1681 1686 1690 1694-
1696 1701 17C9 1711 1719 1722 1723 1726-
1727 1731- 1733 173E 1740 1743 174 4 1747
1749 1753 1757 -1758 1760- 1761 1765 1771
1785

adult brain      Clontech      ABR001
9~2Sf 6 8-69 113 115 146 152 2 36
223 245 277 307 320 324 230-331
344 348 352 362 379 384 293 404
408 414 441-442 454 469 481 490
506 517 586 597 631 641 659 693
715 799 003 833 865 071 675 880
882 908 920 937 1000 1005-1006
1027 1036 1041 1043 1075 1107
1112 1121 1127 1136-1137 2144-
3147 1231 12361239 1280 1293
1320 1345 3355 1361 1383-1384
1400 1417 1448 1456 1476 1507
1570 1572 1609 1610 1614 1020
2626 1645 1653 1754 1759 3770
1786

adult brain      Clontech      ABR006
5^8 15-16 168 il2-213 271 27b
280-281 291-29; 300-303 310 314
321 326 336-33f 341 352 357 359-
360 362 369 371 379 384 393 396-
397 434 419-42C 426-428 430 441-
442 453 506 616-617 661 669 785
798 845 1018 1.109 1113 1124 1148
1167 1187 1207 1227 1252 1265
1285 1312 1317-1319 1324-1327
1344 1369 1381 1400 1416 1421
1427 1430-1431 1436 1471 1501
2557-1559 1586 1588 1651 3653
1664-1665 1671 1673 1690 1697-
1698 1700 1711 1717 1719-1720
1728 1/36 1740 -743-1744 1757
1760-1761

adult brain      Clontech      ABR008
5-10 13-19 22-23 25 29 33 37-39 
43-45 50-51 54-55 57-58 60-66 
60-70 72 75 77-80 83 85 69-92 94 
99-105 108-110 112-113 116-117 
223 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168- 
172 174-175 183-164 188-190 193- 
194 196 198-20C 202 204-205 207- 



WO 01/53312 



PCT/US00/34263



Tissue Origin

RNA Source


Hyseq
Library Name



SEQ ID NOS:



206 210 214-215 218 221-226 221 
231-232 234-241 245-247 251-25? 
2SF 257-259 268-269 271 276-281 
28S-286 288 290-292 300-302 304 
307 309-311 313 315 317-318 320- 
322 32S-326 328 330-331 333-336 
341 344-347 349 352 354 356-357 
362 369-373 376 379-380 382 384 
387 390-351 393-394 397 399-402 
405-411 414-415 417-420 426-42B 
437-438 440-444 4S3-455 462 464 
467 469-471 476 478 492-404 486- 
491 497 503 506-513 516-517 520 
524-526 528-530 532-534 537-540 
542 544 547-551 553 561 565-567 
572-574 577 581 5eS 587-586 590- 
591 597 595 601-602 606-610 612 
615-U7 619-620 622-623 628-629 
631 G33-634 636-641 643 645-647 
651-653 655-664 669-671 673 679 
682 687 689 691-700 702 706 710 
715-717 720-721 725-734 736-739 
742-743 746 750-752 756 756-759 
762-764 766 768 773-778 780-782 
734-/85 787-789 794 796 799 802- 
803 805 811 814-815 818 82S-82t 
834-837 839-840 842-843 656-859 
861-862 865 8G7-B72 874-875 86: 
883-E84 887 889-892 894-B95 897- 
898 901 904 908 910 912 914 917 
919 921-924 926-927 930-932 935- 
941 943 945 949 953-954 958 961- 
963 967 969 971 975 977 981-983 
986 968-990 992 997 999-1002 
1004-1006 1008 1012 1018-1022
1027 1029-1031 1035-1037 1047- 
1048 1053 1057 1059 1063 106E 
1070 1072-1075 1077 1081-1083 
1085-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 116C 
1163 1167 1169 1172 1175 1177 
1180 1183-11B8 1191-1195 1199- 
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1233 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1299 1305-1309 1312 1314 1316- 
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1353 
1354-1355 1357-1358 1365-1367 
1369-1370 1373-1374 1376-1379 
1381-1384 1386-1388 1392 1394
1396-1397 1400 1403-1407 1410 
1414 1419-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446 
1448 1453-1455 1457 1461 1463- 
1464 1466 1460 1471 1477 1480 
1482-1483 1496 1502-1504 1507- 
1509 1513 1519-1520 1524-1526 
1536 1547 1549-1552 1567 1573- 
1574 1578 1586-1589 1597-1598 
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671- 
1674 1676-1684 1686 1689-1690 
1694-1696 1704-1705 1708-1709 





RNA Source 








Library Name



adult brain



adult brain
adult brain



adult brain



adult brain



adult brain



adult brain 



Clontech



BioChain



Invitrogen



~ hS=h ■ 



ABa; 



ABT": 



Invitrogen ; AE3'



Invitrogen



Invitrogen



ABfU : l . 
ABR: . f~~ 



Invitrogen



AM'J't i. *■ 



cultured
preadipocytes



Stratagene



ADPOC 



SEQ ID NOS:



1720-1724 1726-3728 1730-1733 
1737-1740 1742-1745 37S3 175C- 
1757 1752-1761 1765 1767 1771- 
177 2 17 76-1777 1779-1780 1 7B6 
~2A~7 1 T~ 102 186 210 310-331 ~3G4~~ 
36S 508 623 710 937 1002-1003 
1055 1204 1609 1731-173; 



46 182-184 204-205 300 739 767 
1372 1549 1620 168« 



185 204-205 364-365 3 93 497 595 
687 692-694 830 £45 1068 1320 
1413 164C 



187 101 357 364-265 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696

1722 173e 

419 134-435 441-442 
1320 



763 789 963 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



14-16 22-2 3 25 37-3 9 43~T8'S0 
70-72 78 86 94 107 113 116 136- 
137 143 146 152 161 173 182-184 
194 196 1S8 210 218 229 259 267 
295 298 309-310 320-321 324 336- 
338 346-347 349-350 356-357 362 
371 379-380 382-383 391 393 396 
399 401 408 428 438 459 461 4 76 
482 490 502 507-509 516 526 531 
557 562 597 602 607-609 624 652 
655 667 669 671-672 687-689 695- 
696 710 712 715 721 732 739 743 
750 753 766 778 780-781 785 803 
814 826 830 837 841 857 869 874 
894-895 925 937 949 954-956 960- 
963 963 96fc-969 988-989 10CC 
1005-1006 1016-1019 1021 1036-
1037 1052 1086 1090 1109 1113 
1115 1120-1121 1123-:124 3336- 
1137 1140 1144-1147 1151 3167 
1170 1174 1188 1193-1194 1205 
1225 1229 1231 1254 1258 1262 
1280 1285 1309 1312 1334-1335 
1341 1343-1344 1356-1357 1370 
1378-1379 1383-1384 1403-1404 
1423 1429 1434 1442 1448 1451-
1452 1454 1470-1472 1482 1499 
1525 1528-1529 1532 1536 1547
1554 1557-1559 1561-1562 1567
1595 1588 1590 1595 1601-1604 
1608 1610-1613 1615 1619 1624
1627 1640 1644 1647 1660 1664
1666 1670 3675 3696 3704 1735 
1723 1727 1738 1760-1761 1768 
1779 1785-1706 



5-8 11 17 25 68-69 
105 110 116 136-138 
189 196-198 261 267 
301 318 331 336-338 
400 428 430-431 530 
527 549 557 561 602 
631 637 647 670 681 
748 782 793-794 817 
845 858-859 879 882 
960 982 986 995-996 
1005-1007 1025 1027 
1039 1045 1071 1078 
1102 1136-1137 1140 



80 82 87 103 
368 171 188- 
276 288 293 
379-380 391 
-512 520 524 
618 620 622 
-682 710 731 
834-836 843 
893-895 934 
1000 1002 
1028 1032 
3C97 1099- 
1212-1220 



Tissue Origin



adrenal gland 



adult heart 



RNA Source 



Hyseq 
Library Name



Clontech



ADR002 



GIBCO 



AHK001 



SEQ ID NOS:



1260 127: 
1322 1325 
1370-1373 
1437 1466 
1602 1606 



297-1198" 1314 1320 
133? 134$ 136S036* 
1396 1408 1423 243; 
1468 1S33 1539 1594 



.614 



1650 
1696 



1631 1649 
1660 1662 3673 1687-1688 
1711 171S-1720 11<2 1746 174* 
1760-1761 1765 1767 1771 1785 
4-10 ISO." £" 55" 2S -32 4 3~-~4 ~S 4 7 S C - 
51 55 60 62-63 6S-66 75 80 10* 
116 118 122 126 130 137 ISO 1C9- 
170 181 192 198 201-203 215 227- 
226 247 251 255 267-269 271 280- 
281 285 255 298 3)1 336-338 342 
349 351-352 354 372-373 383-385 
391 400 410 415-416 424 426-427 
431 434-437 439 445 454 461 473 
477 483 491 493 497-490 503 516 
SlS 527 S35 546 549 552 572-573 
581 588 5S5 600 602 608-610 620 



628-630 637 
713 715 719 



645-646 670 679 703 
732 734 744-746 758 



773-778 789 816 829 837 845 848 
869 875 883 698 904 912 522-923 
930-931 542 948 952 965 567 969 
976-977 901 990 992-993 1001 
1004 1049 1055 1059 1071-1072 
1076 1112 1113 1115 1121 1127 
1134-113S 1151 115B 1163 1175 
1181 1188 1209 1218 1224-1225 
1227 1231 1243 1270-1271 1274 
1280 1285 1290 1293 1307 1324 
1325 1327 1330 1342-1343 1345 
1348 1365-1366 1369 1378-1375 
1387 1396 1400 1405 1417 1425- 
1426 1436 1440-1441 1444 1454 
1463-1464 1488 14S1 1507 1512 
1538 1546 1567 1573-1575 1586
1598 1609 1614 1618 1622 1624 
1627 1634 1636 1649 1651 1658 
1671 1674 1678-1679 1691-1692
1703 1717 1727 1731-1732 1737 
1765 



4-8 10-11 15-16 18-21 34-39 44- 
46 50-52 57-58 60 62-63 71 75 82 
85 67 89 94 97 100 103-104 106- 
110 112 114 116 118-119 122-123 
127 130-132 134 136-138 141-144 
147-151 153 163064 168-171 179 
186 192 195 197 199 204-205 212- 
215 220 225-226 229-230 232 234- 
236 251 257-260 262 265 272 274 
277 280-292 285-266 289-252 296 
298-301 304 307 309 314 321 324- 
325 330 333 336-338 345 349 351- 
352 354 35e 361 368 370 380 383- 
384 387-398 391 393 397 401 406 
40B-4O9 411-412 414-416 430-431 
433-439 445-446 449 452 454-455 
457 459 462 469 472-473 476-48C 
483-464 487-490 492-493 496-498 
503 506 508 510-513 516 519-522 
526 534 536-540 542 546 549 553 
560-562 574-577 581-582 584 586- 
587 589 593 S95 597 604-609 611- 
612 615-620 622-623 626 632 637 
64S-652 656-660 665-666 670-672 
674-675 633-684 687 692-694 697 
701 709 712 715-716 719-720 72S- 



Tissue Origin

RNA Source

Hyseq
Library Name



adult kidney 



GIBCO



AKD001 



SEQ ID NOS:



726 728 730-732 73 S 
744 746 751 753 759 
771 775-780 78S 788- 
604 810 612 817 821 
6 37 843 845- 847 849- 
£63-864 869 871 875 
683 887 890-892 894- 
901 903 506-907 911- 
521-925 927-928 933- 
561-963 967 969-972 
980- 986 990 992 999- 
1007 1010 1016 1019- 
1023 1025 1028-1037 
1043 1047 105O 1054- 
1059 1063-1064 1G67- 
1072 1075-1076 1083 
1089 1093-1094 1104 
1109 1113 1116-1117 
1124 1126 1128 1131- 
1145 1148-1149 1151 
1169-1170 1175 1177 
1199-1200 1202 1206- 
1216 1218 1222 1227- 
1235 1238-1241 2243- 
1248 1250 1253-1254 
1261 1268 1270-1271 
1282 1287 1292 1298 
1308 1317-1321 1324- 
1332 1334-1337 1339 
1349-1350 1354-1356 
1365-1366 1369 1371 
1378-1380 1383-1384
140O 1403 1409 1417 
1437 1439 1442 1444 
1450 1453 1468 1470 
1481 1488 1490 1501- 
1521 1524 1528 1530-
1537 1539 1541-1542 
1555 1560 1565 1567- 
1591 1597-1598 1601- 
1614-1616 1619-1620
1630-1632 1634 1636 
1645 1647 1649 1652- 
1662 16G7 1673-1674 
1684 1686-1688 1704- 
1711-1712 1717 1724 
1731-1733 1737-1738 
1744 1749 1754-1755 
1765 1772 1785 



736- 739 7TT: 
761 765 770 
790 796 8C2 
826 828 830 
853 857-661 
877-879 881 
895 897-858 
913 915 S15 
935 945 958 
975 977-976 
1002 1005- 
1020 1022- 
1039-1040 
1055 1057 
1068 1070 
1085-1087 
1106 1109- 
1119 1121 
1134 1144- 
1158 1167 
1192 1196 

1208 :2i : 

X229 1232- 
1244 124 7- 
1256-1258 
1277 1280- 
1299 1306 
1325 1330 
1344-134 5 
1359-1360 
1374- 137 h 
1389 1397 
1423-142€ 
1446-1447 
1473 1475- 
1504 1515 
1534 153* - 
1547 255/ 
1571 358f 
1602 1601 
1623-3626 
1641 2644- 
1655 1655 
1680-168'- 
1705 170S 
1726-1727 
1741 174 > - 
1760-17€i 



4-8 10-11 17-21 29-31 35-39 42- 
45 50-51 56-58 60-61 64 68-69 75 
77 80 82 35 67 92-94 97 100 302- 
104 307-108 112 116-117 119 123 
127-333 136-137 139-141 143-144 
147-154 157 161-163 165-166 169 
172 176 178-179 192 194-197 199 
201 203-206 209-210 212-213 215- 
216 223-228 234-236 238 247 251- 
253 257-259 261-262 265-269 271- 
272 274 276-277 279-281 234-286 
290 293 296 298-299 301-302 304 
307 311-313 321 325-326 329-33^ 
333 341 344 348-350 352 356 3S8- 
359 362 364-365 368 370-372 374 
376-377 380-382 392 395 398 400- 
401 404 407-409 414-415 423-424 
430-437 443-444 446 449 451 453- 
455 459 461-462 464 467 469 471- 
474 476-477 480-481 483 487-468 






Tissue Origin



RNA Source



Hyseq
Library Name 



SEQ ID NOS : 



490-491 493 497-505 
S20 522 524 526-525 
544 £47 549 554-556 
567 571-576 57B 582 
S93 598-599 601 604 - 
615-619 621-626 632- 
645-652 655 660-664 
678-679 688 692-695 
713 717 719-720 727 
738 743 745-746 751 
763 765 771-773 775- 
788 793 795-796 800 
810-812 814-819 821 
834-838 842-845 848- 
864-865 857 869 871 
BB6-687 889-891 893- 
902 906-908 910-914 
925-927 929-935 937 
948-949 951 953-956 
964 S69-970 972 976- 
908-990 992-993 99S- 
1004-1008 1010 1012 
1017 1019-1020 1022 
1035 1038-1040 1042 
1050 1054-1055 1057- 
1070-1073 1078 1085- 
1089 1092 1094 1097 
1107 1109-1112 1116 
1123-1125 1132-1135 
1143 1146-1147 1149- 
1154 1157 1159 1163 
1178-1179 1181 1183 
1200 1202-1204 1206- 
1219 1221-1222 1225 
1232-1234 1238-1241 
1246-1247 1253 1257- 
1261 1267-1268 1270 
1281 1283 1287-1299 
1299 1306 1308 1311- 
1320 1323 1329-1330 
1339 1341 1349-1350 
1359 3367 1365 1373 
1379 1394 1397 1400 
1407-1409 1417 1419 
1428-1431 1433 1437-
1443 1445-1446 1448-
1454 1459 1461 1465- 
14/5 1478 1484-1488 
1493 1435 1497-1498 
1509 1512 1518 1521- 
1527-1528 1532-1533 
1541 1547-1550 1552 
1561 1565-1566 1568 
1578-1579 1583 1586- 
1591-1592 1594 1598
1604 1606 1608 1611 
1616 1618-1622 1624- 
1632 1634-1636 1638- 
1644 1646-1649 1653- 
1664 1666-1667 1670- 
1679 1683-1684 1686 
1696-1699 1701 1709- 
1714 1716-1719 1723- 
1727 1733 1737-1738 
1744 1748-1749 1751 
1763-1768 1778 1780 



510-513 5lfc~ 
534 537-54T 
560 562 564 
586-585 592- 
606 6C6-61? 
634 637-64? 
665-672 67E 
698 702 711 
731 735-73f: 
753 75S 762- 
778 780 78C 
803 805 808 
826 829 832 
655 B57-861 
874 876-883 
896 856-900 
918 920 922 
940-942 945 
960-961 963- 
978 982-986 
997 999-1002 
1013 1016- 
1025-103: 
1044 1047 
1064 1068 
1086 1088- 
1099-1102 
1119 1123 
1140 1142- 
1150 1153- 
1167 1170 
1192 1196- 
1211 1216- 
1227-1230 
1243-1244 
1258 1260- 
1272-1274 
1293-1295 
1313 1317- 
1334-1335 
1353-1357 
1375 1378- 
1403 1405 
1423-1424 
1438 1442- 
1450 1453- 
1466 1474- 
1490 1492- 
1506-1507 
1522 1525 
1537 1540- 
1556-1559 
1571 1575 
1587 1589 
1600 1603- 
1613 1615- 
1628 1631- 
1639 1641 
1656 1662 
1671 1676- 
1691-1692 
1711 1713- 
1724 1726- 
1741 1743- 
1760-1761 
1785 



adult kidney



Invitrogen 



AKT002 



20-21 37-39 47 52 57 60 65-66 
68-69 80 104 107-108 122 130 133 
136-137 140 142-143 149 169 174 



Tissue Origin

RNA Source



adult lung



GIBCO



Hyseq
Library Name



SEQ ID NOS:



3 8; 197 227-228 235- 
261-265 26*7 280-28} 



236 
286 



244 
290 



>99 



301 304-205 309 312-313 339 341 
344-345 349 356 370-372 376 382- 
383 307 392 401 414 416 A21 43C 
443 445 449 453-454 472 437-488 
504 506 513 516 519 522 528 536- 
540 546 554 585 587 594 £98 602 
607 616-617 626-627 636 643 662- 
664 695 709 721 73S 743 761 768 
775-777 788 796 804 814 627 837- 
638 049-650 852-853 869-870 881 
890-892 898 903 905-907 914 919 
925 927 534 941 949 952 957 960 
962 968 970 1000 1008 1029-1030 
1044 1052 105S 1063 1067-1066 
1073 1085 1099-1102 1107 3110- 
1111 1113 1115 1119 1126 1134 
1136-1137 1146-1148 1153 1155 
1192 1196 1199 1232-1233 124: 
1256 1264 1272-1273 1281 1285 
1293-1294 1299 1312 1320 1324- 
1325 133C 1344 134S 1351 1355- 
1356 1365 13/8-1379 14C3 1414 
1419 1428-1129 1436 1446 1458 
1463-1464 1467-1468 1470 1477- 
1478 1486 1491 1509 1519 1527 
1529 1534 1547 1596 160C 1619 
1623 1629 1631 1634 1638 1642 
1652 1660 1664 1667 
1673 1686 1709 1727 



1669- 
174C 



1647 
1670 

1776 

T-B'TT 37-39 44-46 ~S0^Si 56 62- 
63 75 82 88 93 103-104 113 125 
133 140 143 150 152 154 157 162 
171-172 174-175 190-191 396 200 
211 214 219 223-224 227-228 253- 
252 256 265 272 274 280-281 285 
310 332 345 351 362 373 381-382 
394 408-409 431 436 445 454 459 
461 467 469 471 476-477 468 504 
513 527 537-540 544 547-548 554 
564 583 607 616-617 621 623-624 
634 645-646 662-664 670 695 716 
719 743-744 763 766 774 789 803 
811 814 817 831-832 B37 -838 845 
852-853 858-859 861 866 680 887 
901 905 941 954-957 966 971 977 
979 981 987 990 992 996 3003 
1O0S-10O6 1014 1017 1045 1047 
1054 1059 1062 1064 1072 1080 
1086-10e9 1094 1107 1126 1134 
1136-1137 1142 1150 1157 3173 
1190 1200 1208 1220 1243 3272- 
1273 1280 1282 1295 1306 1320 
1331-1332 1353 1374 1375 1303- 
1384 1404 1409 1423 1434 1436 
1442 1474 3478 1494 1509 3522 
1525 1531-1532 1547 1549 1553- 
1554 1571 1598 16C6 1613 1624 
1627-1629 3632 1642 1644 3662 
1S69 1676-1677 1684 1696 1727 
1731 1732 1737-1738 1746-1749 
1786 



lymph node



Clontech



ALN001 



4 24 50-51 82 105 137 153 19e 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 362 421 
430 433 445 451 463-462 475 483- 
482 503 526 529 537-540 546-547 



Tissue Origin


RNA Source


Hyseq
Library Name


SEQ ID NOS:





young liver



GIBCO



adult liver | Invitrogen



ALV001



ALV002



621 626 649 679 72 9 12^- ,?(, 738 
793 803 831 834-836 835 844 *S7- 
858 866 879 905 913 928 963 5?C 
1005-1006 1012 1C38 10S0 3 I U 
1117 1151 1199 1204 1226 3241- 
1265 1274 1324-1325 133S 125?. 
1374 1377 1440-1442 1447 1504 
1549 1600 1618-1619 163) 1641 
1644 1653 1687-1688 1691-165; 
1741 1771 

l~~e Ti~ 20-21 4 6"so-5i sF'e s~-~g1 

75 79 82 93 97 102-103 108 11C 
116 139 143-144 148-149 171-372 
174 387-189 194-195 198 209 234- 
215 230 250 258 267-269 260-28: 
306 309 342 351 356 35S 362 372 
374 392 394 398 401 407-4C8 410 
414 431 444 455 459 476 470 403 
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 6C1-602 
607 621-624 628-630 632-633 637 
648 660 666-667 678 697-698 70C 
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 806 612 
814 841 849-851 871 B74 079 887 
893 E98-90C 902-904 906-9C7 511 
919 922 924 934 953 957 963 965 
970 584 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1C8S 
1093 1099-1102 1120-1112 1316- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 139* 
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339 
1344 1359 1362-1363 1379 1383- 
13B4 1403 1415 1430-1431 143? 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550 
1552 1557-1559 1565 1583 1587 
1597 3609 1614 1620 1631 1637 
1641 1644 1654-1655 3662 1667 
1669 1684 1691-1692 1702 171: 
1725 1738 1741 1743-1744 17Sf 
1760-1761 1763-1765 176S- 



5-8 17 2G-21 32-33 41 55 50 64 
75 77 86 89 102 108 317 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 40f- 
424 428 430 433-435 454 476 454 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 762 
794 814 820 826 834-837 847 845- 
850 85B 861 874 879 893 898 904 
911 918 921-922 926 946 949 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277 
3285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 



Tissue Origin



adult liver



adult ovary 



RNA Source

Hyseq
Library Name



Clontech



ALVOO:- 



Invitrogen



AOV00: 



1 uso "1^67 1578 i5e: l5T3~iV94 

( 15S7 1601 -1602 1616 
1618-161S 1621 1625 1637 1645 
1647 1G52 1654-1655 1660 US6 
1669-1671 1684 1706 1722 J737- 
1738 1742-1744 1760-1761 1752- 

1765 17 72 1774 

29 67 6 297T663 1119 1536 i 76"6~ 



1 4-18 20-23 29 35-40 42-48 50- " 
51 53-58 61-63 65-66 68-65 73-75 
77-78 80 82 8$ 87 89 97 iCO-101 
103-104 106-108 110 113 1:5 110 
122-124 126 128 133-134 136-140 
142 145-147 149-157 162 166 168- 
170 174 177-173 180 182-186 186- 
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-265 271- 
272 274 277-281 284-286 2«8 290 
295 299 301-302 304 307 3C9-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-3S2 3 94 397-398 
10O 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 469- 
471 473 476-479 483-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-S20 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570 
572-573 575-576 575 581 503 585- 
588 590-591 593 595 597 SSS 601- 
605 607-613 615 618-622 624-627 
630 632-633 636-640 642 644-64/ 
649-652 654-655 6i>7-665 667-675 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 75C-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 78e 790-791 794-796 
800 B03 805 809-811 813-615 8lfc- 
819 821-824 826 826-829 831-832 
837-838 843-850 85^-857 859-864 
867 869 871-872 874-675 87f.-e83 
887-886 890-895 896-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 S53 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 955-997 599- 
1001 1004-1009 1013-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 J041-1047 
10SO-1OE1 1054-1060 1062-1064 
1067-1070 1072-1073 1075-3076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 11C6-1106 
1112-1117 1119-1120 3123-1127 
1131-1135 1142-1143 1146-114S 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-ll7e 
1180 1183-1185 2190-11S1 1195 
1197-1200 1202 3205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
n268 1270 1275 1276 1280-1283 
1286-1289 1291 1293-1294 1298- 






Tissue Origin



adult placenta



placenta



RNA Source



Clontech



Invitrogen



Hyseq
Library Name



SEQ ID NOS:



APL001



APL002 



1299 1306 1308 1312 1"*17- 1323 u//
132Q- 1~x^n 1JjU 1332- 1333 1338- 1339
1341 JJ**J~ 1351 1356 1259 1361 1365-
1366 1371- 1379 1377- 1378 1383- 1384
1386 1389 1394 1400 1404 1416- 1417
1422- 1427 1425- 1431 1435- 1436 1439-
1443 1445- 1450 1454 1459 1463- 1464
1466 1468 1470 1474- 1481 1484- 1485
1488 1491 1493- 1494 1496- 1498 1501-
1504 1506- 1en. JL51/ J.t>J.? 1521- 1524
1CLOT 1530- 1531 1534- 1536 ljJ0* 1Z>*t1
1j^b 1546-1550 1553 1DD_>- 1CCD 1561-
1563 1566- 1567 1569- 1c*7n 10/u 1572
1574- 1575 1578 1580- 1581 1587- 1588
1590- 1591 1595 1597- 1598 1600- 1606
1609 1611- 1621 1623- 1630 1634 1636
1638 1641 1643 1645 1647- 1657 1659-
1662 1664 1667 1669- 1671 1673- 1674
1676- 1681 1683- 1690 1699 1702- 1707
1710- 1711 1713- -1714 1716- 1719 1723-
1724 1726 -1729 1731- 1733 1735 1737-
1738 1740- 1741 1743- 1744 1748- 1751
1753 1755- -1756 1760- 1762 1765 1767-
1768 1770- 1771 1776 1778 -1779 1783-
1784 1786





5-8 44 -45 90-91 
311 351 414 4 /6 
636 719 755 773 
947 955-9S6 962 
1045 1202 3320 
1713-1714 3743- 



107 
503 
860 
990 
1369 
1744 



: 108 159 178 
545 574 624 
890-891 924 
992 1002 
1628 1686 



14-16 26 29 43 
106 116 135 171 
198 210 216 235 
309 329 334 339 
423 430 434-435 
491 517 522 631 
738 746 769 81B 
858 916 948 953 
1005-1006 1013 
1068 1070 1086 
1160 1277 1285 
1345 1429 1435 
1486 1490 1512 
1592-1593 1602 
1664 1673 1675 
1746 1776 



60-61 
177 
236 
359 
448 
723 
843 
-954 
1033 
1139 
1337 
3438 
3519 
1626 
1722 



79-80 103 
180 194 196 
272 290 299 
379-380 417 
454 483 490- 
725-726 728 
854-855 857- 
976 S88-989 
1036 1064 
1144-1145 
1320 3343 
1454 1482 
1532 1549 
1647 1649 
1727 1730 



adult spleen 



GIBCO



ASP001 



3 5-8 12 15-16 
44-45 57 60 82 
103 106 10E 117 
147 152-153 155 
178-180 196 198 
215 219 234 253 
272 280-281 290 
325 333 341 349 
387 394 406 414 
448 451 473 481 
505 517 519 530 
554 557 574-576 
611-612 620-621 
652 659 661 667 
700 721 728 730 
746 762 765 774 
810-811 817 E22 
852-853 858 662 



19-21 24 
83 87 89 
119-121 
166 169 
2C1-206 
-254 256 
295 302 
358 372 
431 434 
490-493 
534 536 
582 592 
623 631 
671 673 
732 738 
78C 788 
930 832 
666 874 



25 34-36 
94 98-99 
139 141 
171 174 
209-211 
258 264 
309 312 
382 386- 

-436 446 
500 503 

•540 547 
595 604 
632 642 

-675 684 
742-744 

•789 794 
845 848 
879 882 



Tissue Origin



RNA Source



Hyseq
Library Name



Genomic DNA
from BAC 63118



GIBCO



ATS001



Research
Genetics
(CITB BAC
Library)



BAC001



SEQ ID NOS:



884 906- SOSTsii 919~921-923 9?6- 
927 934 947 949 957-958 963 977- 
578 983 990 592-994 996-997 955 
1005-1007 1010 1012 1C31 103C 
1042-1044 1046 1049 1059 1066 
1070 1076 1089-J050 1054 1103 
1105 1112 1115 1224 1140 1163 
1170 1174 1177 1190 1196 1215- 
1220 1226-1227 1229 1236 1241 
1246 1256 1269 1271 1274 129E 
1301 1320 1322 1330 1334-1331- 
1339 1349 1351 1353 1359-1360 
1364 1369 1374 1306 1397 1413 
1417 1434 1436-3437 1439 1466 
1474 1477 1480 1485-1487 1498 
1512 1522 1525 1544-1549 1553 
1560 1567 1591 3600 1631 1636 
1651 1654-3655 3658 1662 167C 
1674 1676-1679 1664 1686 1700 
1727 1733 1738 1740-1741 1760- 
1761 1774 1779 1781-1782 



5-8 10 26 30-31 47 50-51 57 68- 
69 82 84-85 97 102 113 119 137 
139 150 152 154 156 163 169 174 
176-177 192 194 196-197 212-215 
227-228 247 255 258 261 282 285 
288-289 301 307 311 316 330 334 
349 370-372 392 398 410 415 426- 
427 430-431 433 437 446 454 461 
469 473 477 483-482 493 499 502- 
503 S13 522 526 547 552-553 563- 
564 572-573 575-576 581-582 585 
599-602 605 612 615-617 620 631 
637 647 649-650 656 660 665 670 
674-675 712 719-721 723 728 731 
738 744 746 773 780 784 788-789 
802 804 809 811 814 826 832 837 
843 845 848 8S9 666 869 877 905 
913 916 919 921 S26 929 937 950 
960 963 971 975 577 981 990 952- 
993 1007 1016 2029-1030 1034- 
1035 1038-1039 1045 1059-1060 
1064 1070 3072-1073 1007 1089 
1097 1099-1102 1104 1108 1113 
1141 1149 1361-1162 1175 1206 
1209 1222 1227 1229 1231 1235 
1238-1239 1243 1253 1285 1287- 
1289 1291-1293 1307 1311 1317- 
1320 1330 1332 1338 1345 1365 
1373-1374 1379 1389 1399-1400 
1409 1423-1424 1430 1435-1437 
1443 1455 1484 I486 1490 1453 
1496-1497 1501 1505 1509-1513 
1527 1530-1531 1533 1537 1546 
1549 1563 1565 3567 1569 157" 
1577 1586 1591 15S9 1602 1625 
1628 1630-1632 1636 1639 1642 
1649 1662-3662 2666-1667 1670 
1675 1684 2690 1699 1705 1712 
1717 1724 1730 1737-1738 1752 
2767 1779 

686 1352 1412 " 



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC
Library) 



BAC002 



1411-1412 






Tissue Origin

Genomic DNA
from BAC 35316



adult bladder



bone marrow



RNA Source

Research
Genetics

(CITB BAC
Library)



Hyseq
Library Name



SEQ ID NOS: 



BAC003 



Invitrogen



BIDOOl 



Clontech



BKD0O1 



| 135: 



S-8 17 16 22-23 33 37-39 56-5'. 
80 S3 100 120-121 169 201 23*7 
251-252 272 278 311 348 363 382 
413 415 424 430 443 483 502 542- 
543 562 Sfe<; 6C7 616-617 626 635 
652 667 67a 710 727 755-756 761 
773 786 789 837 840 866 893 69t 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167 
1185 1289 1199 1270 1369 1483 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1656 
1669 1671 2690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



3-8 11 13 18 29-31 33 35-36 4C 
43-45 47-48 50-51 57 6C 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 378-180 
187 192^193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 ^42-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-3A5 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 £30 535- 
540 542 54^-545 549 555 565 567 
569-577 561 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 636- 
660 666 fclO 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-776 780 785-786 769-791 
796 798 802 810-812 823-824 826 
630 832-833 837-838 843-844 846- 
e55 e58-859 866-867 869 878-88C 
883 890-892 896 9C3 905 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 9S5-958 963 969 973 
976 981 985 987 990 992 995 1 00O 
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
120C 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1295 1287 1290- 
1291 1293 1299-1301 1306 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494 






Tissue Origin

RNA Source



bone marrow | Clontech



Hyseq
Library Name



SEQ ID NOS:



I 1506 

I is?e 

I 154 6 
' 15S7- 
| 1592 
j 1626- 
j 1638- 
1653- 
1684 
1713- 
1727 
1772 



bmdoo; 



1509 
1528 
1548- 
1559 
1597- 
1626 
1639 
1655 
1686 
1714 
1737- 
1781- 



152 3 

1531 

1549 

3572- 

1600 

1630- 

1641 

1661- 

1690 

1717 

1738 

1702 



152T- 

1536- 

1552 

1572 

1609 

1632 

1646- 

1662 

1702 

1720 

1740 

1785- 



T522" 

1537 

1554- 

1581 

1614 

1634 

1 647 

1676- 

1707 

1722- 

1758 

1736 



1524 

2 54 3 

155^ 

3589- 

162: 

1636 

165: 

3681 
171 • 
17 2C' 
176" 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 1 37 
139 169-170 174 177 i80 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 S16 
520 523 525 531 545 548 552 566 
569-570 581 583 590-591 S97-596 
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 726- 
719 728 734 737-738 742 761 765 
774-778 790 ell 814 818 830 834- 
836 854-855 859 866 869 871 878- 
879 8B4 889 892 904 922-923 932 
990 952 998 1001 1004 2016 1036 
1 042 1048 1051 1054-1055 105P 
1OB8-1089 1106 1112-1114 1155 
1157 3192 1200 2223 1227-i22f 
1236-3237 1260-1261 1282-1263 
1285 1287 1295 1334 1317-1323 
1324-3327 1330 1333 3341 134? 
1347 1350 1353 1355-1357 1367 
1369-1370 1373 1377 3379 1387: 
1383-3384 1394 1397 1400 1406 
1413 1417 1425-1427 3438 144^ 
1446 1459-1460 1470 1493 1501 
1521 1536 1546-1549 1560 1573- 
1574 1578 1598-1600 1621 162C 
3631 1634 1646 1649 3653 1656 
3658 1669-1670 1683-3684 1687 
3 688 3690 3693 1696 2699 170; 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 17S2 1/55 
1760-1761 1767 1777 1781 -178", 
1786 



bone marrow
bone marrow



Clontech 



73-74 503 922 1036 1711 



Clontech 



EMD007 



95-96 866 1320 1475 



adult colon | Invitrogen



CLN001 



17 56-58 103 110 117 144 150 171 
179 185 388-189 201 204-206 210 
218-221 22S-226 231 237 251 277 
288 310 312 320 333 359 386 388 
394 408 420 455 401 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 553 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-2055 1063 1066- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 3320 
3345 1353 1355 1369 1428 1439 



Tissue Origin



Mixture of 16
tissues -
mRNAs



Mixture of 16
tissues -
mRNAs*

adult cervix



Various
Vendors



Various
Vendors



Hyseq
Library Name



SEQ ID NOS : 



cTboir 



CTL021 



rvxoo: 



i 146^-1464 isi2 ibS6 isfc:* isr; 

■ 1594 1596 1614 1625-1626 1631 

! 3639 1645 1650 1675- 1677 1667- 

| 1686 1701 1713-1714 1724 174C- 

' 3 76 5 

' .'"4 0 3 14~9lTl 6 86 



I 312 782 1132-3133 1403 1712 1715 



33 



1 4-8 11 13 18-21 25-26 30-31
37-39 43 46-47 56 61 64-66 71
73-74 82 85 94 100 103-104 113
118 122 126 130 134 140 147 153-
156 163 170 179 181 186 192 195-
196 198 201-202 218-219 222 225-
231 257 266 276-277 285-286 288 
296 301-302 304 307 312-31C 324 
326 329-330 332 335 342 352 3Sfc 
362 371-372 376 379 381-382 384 
386 398 400 410 414 416 42S-420 
426-427 430-431 433-436 43? 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513 
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581-
582 585-586 588-589 593-594 600
602 604-605 607-609 612 615-619
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 69S 
708-709 711 713 720-721 727 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831
832 834-836 843 847-848 851
857-860 864-866 869 871 876
880 882 887 890-891 897 899-902
905-908 912-913 916 918-919 922
927 932 934-938 944 948 955-956
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 100( 
1005-1007 1016-1017 1024 102: 
1033 1036 1038 1045 1047 105>- 
1056 1066-1067 1071 1073 1075-
1079 1082 1098 1113 1124 112!-
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1234 1216 122j-
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 126f- 
1270 1279 1287-1290 1308 1310-
1311 1316 1320 1323 1327 134S 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



706 
729 



855 
878- 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) Normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).



Tissue Origin | RNA Source



diaphragm



BioChain



endothelial | Stratagene
cells



Hyseq
Library Name



SEQ ID NOS:



diaoo: 



ectoo: 



1406 

1437 

K6S 

1503

1531 

1585 

1609 

1626 

1649 

1674- 

1702 

1724 

1741 

1760- 

1786 



1416 

1442 

1472

1506 

1533 

1589 

1614- 

1628 

1653 

1675 

1709-

1729 

1743- 

1762 



1446 

1478 

1512 

1541 

1597- 

1616 

1630 

1656

1683 

1710 

1731- 

1744 

1767 



1427 

1448 

1462 

1522 

1547 

1598 

1620 

1638 

1662 

1685- 

1715 

1732 

1748- 

1773 



1431 

1453 

1496 

1527- 

1569 

1600 

1623- 

1641 

1667 

1688 

1717 

1735- 

1749 

1778



1436- 

145? 

1500-

lS2fr 

1573 

1608- 

1624 

1643 

166? 

169? 

1722 

1735 

1755 

1785- 



1372
1478 



82 289 730 780 
1599 1634



986 1409 



3 5-10 13 15-21 24 
39 42 44-45 50-51 5 
60-61 65-66 68-69 7 
82-83 85 87 89 93-9 
110 112-114 116 118 
133-134 137-142 147 
161-163 166-172 176 
192 194 196-201 204
214 220 224 229-230 
240-241 251-252 258 
267-269 272 276-277 
285 288 290 295-296 
311 313 316 321 325 
335 340-342 351-355 
380-382 364 387 390 
407-408 410 412 414 
431 434-436 439 444 
463-464 472-475 477 
490 497-498 500-504 
519 522 524 526-528 
540 542-546 548 561
572-576 579 581 585 
595 597 599 603 607 
620 622 626 630 632 
644 647 656-660 662 
678 680-682 692-697 
712-713 715 730 732 
743-746 751 759 768 
778- 783 786-789 793 
807 810-811 814 816
824 826 828-829 832 
845 848-850 854-860 
871 874 876-879 883 
891 894-895 898-900 
913 916 919-922 924 
935 939 943 948-949 
959-961 964 969-970 
983-984 988-990 992 
1000 1002 1004-1013 
1022-1025 1028 1031 
1038-1046 1050 1055 
1060 1062-1064 1067 
1074 1076 1078 1082 
1089-1090 1093-1097 
1107 1109-1113 1116 
1126 1128-1131 1134 
1140 1144-1145 1148 
1157 1160 1163 1173
1198-1199 1202 1205 
1216-1217 1219 1221 
1232-1235 1238-1241 
1246 1250 1253 1257



26 29 34 37- 
3-55 57-5P 
3-74 77-78 80
6 101-105 106 
-122 124 128 
-150 152-153
-179 187 190 
207 210 212- 
233 235-236 
261-262 265 
279-281 284- 
301-302 310- 
329 331-333 
360 371 375 
392 397 400 
416 425-427 
445 449 454 
-479 486 468- 
510-513 516-
532-534 536- 
-563 566-567 
-586 589 593 
612 615-617 
634 638-641 
-664 670 673 
707 709-710 
734 736 736 
771 773 775- 
800 803 805-
818 821-822 
834-838 842- 
862 864 869 
885 887 890- 
903 905 910-
926-928 930- 
951-954 957 
973 975-978 
-993 996-997 
1016-1020 
1033-1034 
-1056 1059-
-1070 1072- 
1086-1087 
1099-1103 
1117 1124- 
1135 1138
1149 1153 
1183-1184
1207 1211 
1225 1229
1243-1244 
1258 1261 



Tissue Origin | RNA Source






Genomic clone
from the short
arm of
chromosome 8



esophagus
fetal brain



fetal brain



fetal brain 



Genomic DNA 
from 
Genetic
Research 



BioChain



Clontech



Clontech



Clontech



Library Name 



SEQ ID NOS:



efmoo: 



f.fooo; 

"FDR OOf " 



FDK0 04 



KP.KOOfc 



1265 
1277 
1290 
1217 
1330 
1345 
1367 
1400 
1424 
1440 
1468 
1491 

isi: 

1531 
1547 
1561 
1579 
1592 
1615 
1631 
1650 
1669 
1696 
1719 
1736 
1755 
1771 



1266 1266 
1280-1263 
1293 1295
-1320 1324
1234-1330 
-1347 1350 
1369 1374
1406 1408
-1426 1428 
-1442 1448 
1472 1474 
1493 1501
1516 1520
1536-1537 
1549 1552
-1565 1568 
1581-1583 
1597 1605 
1618-1621
1634 1636 
1652-1659 
1671 1675 
-1698 1703 
1722-1723 
1739-1741 
1760-1761 
-1773 1776 



286 
1411 



686 1297 1 
1412 1754



12 70- 1 27} 1274- 
12B5-:2&6 1286- 
1296 1306 1312
-1325 3 32 7 13?fr- 
1338 1342-134T- 
135503S6 1359 
1376 "379 1398 
1414 1417 1419 
1431 1434-1436 
1450 J462-1466 
1476 2487-146C 
1504 1506 1509 
-1521 1526 152S 
1539-1540 1546- 
1555 3 557-1555 
1571 1575 1578- 
1587-] 588 159C; 
1606 1611 1613 
1628 1630- 
•>641 164 3- 
1666-1667 
1683-1686
1715-1716 
1731-1733
1743-1744 1749 
1765 1767-1768 
1779 1783-1786 
303~1304" TT52 



1624 - 

1638 

1664 

1681 

1711 

1726 



131-132 261 
1000 1007 135 



89 380 

7 



b03 860 092 



62-63 89 112 
379 391 411 4 
710 867 1012 
1320 1407 164 
1732 1746 176 



68-69 90-91 1 
362 374 403 4 
668 670 691 7 
1209 1216 123 
1387 1410 141 
1547 1593 



126 194 
81 546 5 
i031 10: 
3 16 52 Z 

39~~2T2~ 
36 611 
85 805 
2-1233 
6 1430 



322 336-338 
63 607 675 
5 1251 1262 
686 1731- 



213 301 331 
6 4 5-646 659 
845 116? 
1 238-1239 
1496 1536 



5-5 25 43 60 
QO 87 92 101 
149 152-153 
207-208 210 
238 251-253 
3O1-302 307 
330 333-334 
357 370 373 
391-392 397 
411 417 421 
437 440-443 
476 483 488- 
513 516 51S- 
544 547 550 
590-591 595 
623 628-629 
657-658 660 
689 691-694 
710 716 720 
744 757-76C 
806-807 810 
856 861 864 
894-895 898 
936 938 945 
959 961 963 



62-63 65-66 70 72 
j03 106 114 136 139 
257 168 171-172 175 
212-213 221-226 237- 
266 272 279-281 295 
310 317-3ie 321-324 
236-338 346-347 352 
377 379-380 382 384 
399 402 406-408 410- 
424 426-427 430 436- 
454 460 464 467 473 
489 495 497 508 510- 
520 524 530 537-540 
561 567 572-574 582 
597 604 607-609 615 
631 634 636-640 655 
665 669 674-675 679 
696-697 699 701 706 
728 732 734 736 742- 
763 775-778 780 799 
817-818 826 839 843 
871-872 884 890-891 
904 915 921-923 935- 
550 952 955-956 958- 
967 969-971 990 992 



Tissue Origin



fetal brain 
fetal brain



RNA Source 



Hyseq
Library Name



Clontech



FBks03 



Invitrogen



FBI 002 



Invitrogen 
Clontech



SEQ ID NOS:



999 1001 1005-1006 1008 1013 
1016 1022 1024 1029-1030 1032 

1035 1042 1047-1048 1052 1056

1065 1067 1070 1082 1089 110? 

1114-1115 1119 1131 1143-1145 

1151 1153-1156 1160 1163 1167 

1172-1173 1178 1184 1186 1188

11S0-1200 1211 1216 1222-1223 

1226-1227 1229 1231 1236 1245 

1253-1255 1258 1260 1262 1266 

1270-1273 1281 1287 1308-1309 

1314 1317-1320 1326 1334-1335 

1339 1341 1344 1350 1356 1369- 

1371 1373 1376 1379 1381-1382

1386 1392 1396-1398 1419 1423 

1425-1426 1428-1429 1432 1437 

1440-1441 1448 1466 1470 1482

1502-1503 1507 1511 1513 1516
1519 1536 1544 1549-1550 1557- 

1559 1573 1589-1590 1598 1608 

1611-1614 3639 1621 1625-1626 

1640 1651 1657-1658 1676-1679 

1693 1696 1703-1704 1713-1714 

1718 1720 1722 1724 1726 1728 

1730-1733 1735-1736 1738-1739 
1742 1745 1755 1759-1761 1765 

1767 1771-1772 1777 1779-1780 
1786 



235-236 520 864 1066 1188 1587



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490
494 509 516 519 522 527 557 563 - 
562 572-573 550-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825
829 840-841 847 854-855 857-858
897-900 904 919 925 935-937 946 
948 949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120*1128
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294 
1312 1334 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757 
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



fetal heart 
fetal kidney 



FKR001 



105 124 160 289 864 1036 1148 
1229 1614 1616 1762 1785 



FKD001 



5-6 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 



Tissue Origin



fetal kidney
fetal kidney



fetal lung



fetal lung 



fetal lung 



fetal liver-
spleen



RNA Source | Hyseq

Library Name



Clontech FKCQ02 
Invitrogen | FKDOO^



Clontech



FLG001



Invitrogen



FLG003



Clontech 



Columbia
University



FLG004



FLS003 



SEQ ID NOS:



258 277 280-281 307 310 314~320~ 
371 337 392 395 403 422-423 431 
436 443 455 469 500 519 522 542 
563 572-573 565 600 619 623 650
654 657-658 660 679 715 731 7b0
798 821 833 844 854-855 857 864
868 878 911 929 958 960 969 990
992 1007 1046 1087 1103 1129 
1139 1285 1332 1331 1355 1365 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601
1618 1631 1651 1654-1655 1669
1678-1679 3692-1692 1733 1785



352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



35-36 94 323 371 398 426-427 445 
473 549 560 804 616-617 626 631 
649 651 719 746 786-787 832 842
849-850 864 894-895 1075 1178
1182 1200 1206 1309 1311 1345 
1429 1453 1567 1576 1620 1686 



9 15-16 29 41 47 68-69 83 88-89 
102 124 137 152-153 165 196 224 
229 231 249 254 256 267 291-292
300 325 333 344-345 352 373 376 
379 384 408 426-427 430 432 467- 
468 475 483 488 493 516 531 535 
545 547 549 564 582 602 623 644 
660 662-664 670 673 725-726 728
761 766-767 774 805 830 852-853
864 875 923 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 1185 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1423-
1414 1431 1438 1449 1491 1512 
1536 1547 1557-1560 1567 1590
1602 1636 1644 1653-1665 1662 
1667 1671 1675 1680-1681 1706 
1739 1760-1761 1769 



103 276 334 
1614 1656 



465-466 737 843 1131 



3-11 13 15- 
51 54 56-58
77-80 82-83 
110 112 116 
135-139 141 
157 163-165 
180 186 188 
200 202-206 
233-236 240
255-256 258 
274 276-278 
293 295 299 
311 334 316 
332 342 344 
358 360 362 
386-387 390 
406 408 410 
437 439-442 
456 459 461 
487-488 490 
506 509-513 
529 531 534 
553-554 561 
576 579 581 



21 23 30-39 
60-66 68-69 
85 87 89 92 
124 126-127 
144 147-149 
167-172 174 
190 193-194 
210-214 219 
-244 246-247 
261-265 268 
280-281 284 
-201 304 306 
318 320-321 
-345 350 352 
370-374 376 
392-393 400
-412 415 417 
444-445 448 
-470 472-479 
-491 493 
515-520 522 
536-540 542 
562 564 567 
583 585-597 



41-48 5U- 

72 75 
-103 105- 
130 233 
152-153 
176-178 
196 298- 
221-231 
250-251 
-269 272 
-286 288 
-307 309 

326 329- 
-353 356- 
378-384 
401 403 
419 422- 
452-454 
481-483 
-501 503- 
-524 526- 

547-549 
-568 571- 
599-605 



Tissue Origin



fetal liver- 
spleen 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



Columbia 
University



FLS002 



"607 610-623 615-£2j 623-624 62£ 
628-624 636-640 644 647-650 655- 
660 66 4 j 665-670 672 674-675 6 76 
681-662 684 690-695 697 702 706- 
710 713-724 716-729 725-728 730- 
731 734 736 73B 740-741 743-746 
748 750-752 759-766 768 772 774-
777 775 783-788 793 796 798 800- 
805 800 820-612 814 810-619 821- 
624 826-e32 634-837 843-847 849- 
867 865-676 878-863 887 889-895 
697-896 902 904-914 916 919 921- 
928 930-937 939 945-950 953-958
960-961 963-965 967 969 971 974-
978 96C-983 986 988-990 992-993 
995-957 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1026- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 105b-
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087 
108S-1090 1097 1099-1103 1107- 
2123 2115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 :195-1200 1202 1206 1708- 
1211 1214 1216 1218 1221-1222 
1225 1227 2234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1260 1270-1273 1277-1282 
2284-2285 1287-2250 2294 1299- 
1300 1306-1308 1313-1320 1324 - 
1325 2327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-135C 
1353-1360 1362-1363 1365-1367
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 2509 2511-1512 2526- 
1519 1524-1526 1529 1532 1536- 
1541 2546-1547 1545-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625
1627-1628 1630-1632 1634-1635 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773
1780 1783-1786 



3-11 13 15-21 26 29 32 35-39 42
44-45 48 50-51 54-55 57-58 61 64
68-69 73-75 78 80 82 84 87 95-98
100 103 105 107-108 110 112-113
116-119 122-125 128 130 137-139
145 147-153 155 157 159 161-163
166 168 171-172 174-175 177 181
188-189 193-194 196-198 200-203 



Tissue Origin | RNA Source



Hyseq
Library Name



SEQ ID NOS 







206 212-215 219-221 223 22S-~22lT 
231-232 240-244 24£-247 250-25: 
258-259 262 264 266-265 272 275 
j 277 280-281 284 266 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334
342 348-349 352-353 356 359 368
371 374 376-379 362-384 366-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 425-430 433- 
436 438 440 443 445 448 451-452 
454-455 460-463 465-467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 S0O-S01 SOS- 
SOS 509-513 515-517 519-520 524 
526-531 534 537-542 544 547 552- 
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 619- 
624 626-632 634 637-642 644 647 
649-652 654-659 662-665 669-672 
674-675 681-682 685 686 690 696 
698 700-703 707 709-710 713 717
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 755 
763-766 768 770 773-777 780 782 
784 7B6 791 795-798 803-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856
858-861 865 867 869 871 873-874
876 878 881-882 887 889 892 894-
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 956
961 965-967 971 973-975 977-979
981 984-985 990 992-993 995-957 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044-
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1146-1150
1156 1158 1160 1163 1172-1173 
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1206 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261
1264 1268 1270-1271 1275 1275- 
1279 1284-1286 1286-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496
1498 15C0-15O4 1506 1508-1505 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567-
1569 1580 1587-1588 1591-1592 






Tissue Origin | RNA Source



fetal liver- | Columbia
spleen | University



fetal liver | Invitrogen



Hyseq
Library Name



SEQ ID NOS:



flsoo: 



flvoo: 



1597- 


iT9T" 


1600- 


1601 


1611- 


i6i;. 


1616-


1628 


1630- 


1631 


1635- 


1638 


1641 


1646- 


1649 


1652 


1654- 


1655 


1661- 


1662 


1664 


1667 


-1669 


1674 


1676- 


167S 


1633- 


1684


1666- 


168t 


1691- 


1692 


1699 


1702 


1707 


1713 


1713- 


1714 


1717 


1719


1722 


1726- 


1727 


1730- 


1733 


1738 


1740 


1743- 


1744 


1748- 


1752 


1758 


1760- 


1761 


1763- 


1764 


1767 


1769 


1772- 


17 71* 


1776 


1779 


1783- 


1786







103 300 318 321 352 372 379 381 

384 392-393 403 422 424 429 434- 

435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 

1357 1369 1378 1418 1424 1622 

1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



15-16 26 24 58 61 64 70 75 78 85
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200
204-206 210-211 220 225-226 230
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379-
382 395 408 412 414 419 429 434- 
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872-
873 875 881 889 894-895 909 911 
916 954 962 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043
1070 1086-1087 1089-1090 1118- 
1115 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362- 
1363 1403 1405 1415 1419 1425 
1476 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1544 1649 1666 1G74 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 



fetal liver 



Clontech 
Clontech 



FLVOO > 



676 998 1715 



FLV000 



93 133 214 301 355 374 379 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324-1325 1327 
1355 1367 1425-1426 1536 1690 
1733 1760-1761 



fetal muscle 



Invitrogen 



KMS003 



26 37-39 50-51 58 84 86 89 98
113 128 131-132 139 155 172 186 
194 196 201 206 211 230-231 256 
261 276 282 286 302 325 359 361 
375 379 383 398 412-413 419 430 
436 448 452 462-463 473 477 503 
519 529 561 569-570 590-591 597 
607 623 626 635 647 660 672 715 
725-726 730 733 761 775-777 786 
826 837 860 874 913 915 921 935
970 980 986 988-990 992 1000- 
1001 1007 1014 1027 1035-1036 
1045 1060 1064 1070 1083 1097 



Tissue Origin | RNA Source



fetal muscle



Invitrogen



Invitrogen 



Library Name



SEQ ID NOS:



fksoo: 



FSKOC;- 



1099-1102 1116-1117 1121 1164

1173 1198 1208 1228 1240 1256
1266 1270 1277 1298 1317-1320 

1324-1325 1329 1336-1337 136?

1383-1384 1399-1400 1403 1409

1433 1505 1514 1542 1SE1 1554 

1557-1559 1562 1589 159$ 1620 

1632 J644 1S50 1652 1671 1675 

1712 1725-1726 1743-1744 1754
1766 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 116-119 
123 253 135-137 139 144 146 146 
151-153 156 163 170 176 180 188-
185 197-198 200 202-203 210 216 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 33S 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384
388 394 404-405 4C8-405 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 52G 531 537-540 547 549 56C- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 66S 
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 708- 
789 798 809 811 814 816-817 822
824-826 831 842 857 859 861 863-
864 881 894-895 908 910-911 916
918 922-923 928 932-933 935 937
946 948-949 953 960-961 966-967 
970 575 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 107? 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182
1189 1192 1195-1196 1198 1201- 
1205 1206 12111212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1353 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535-
1536 1547 1549 1557-1559 1588
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624-



Tissue Origin | RNA Source | Hyseq

Library Name



fetal skin 



fetal spleen
umbilical cord



Invitrogen FSK002



BioChain 



FSP001



FUC001 



SEQ ID NOS:



1626 1632 1634 1636 1641 1643-

1644 164€ 165':-16S7 1660-1662 

1665 166* ICwS 1685 1687-1689 

1702-1703 1709-1710 1716 171?

1724 1727 1731-1732 1737-1740

1742 174-J 1749 1755 1760-1761 

1765 1772 1776-1777 1779-1780
1785



13 286 302 3C7 313 
339 341 354 370 372 
408 414 426-',77 433 
515 544 585 598 767 
1076 1105 1155 1317 
1333-1335 1343 1347 
1371 1377-1376 1391 
1466 1647 1656 1678 
1688 1693 1716 1721 
1732 1739 175'. 



321 330 33if 
385 400 402 
436 450 454 
810 845 935 
1320 1326 
1350 136S 
1397 1422 

-1679 1687- 
1725 1731- 



110 137 211 353 599 927 1108 
1639 1771 



4-8 10 12 14 17 33-36 44-46 5*J 
64 68-69 75 82 85 101 104 113-
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 186 192 197-198 200-202 212-
215 230 234 216-247 251 256 263 
267 271-272 28C-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-38C 386 39C 
392 394 406 406-410 412 414 416 
420 424 427 43C-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 £55 561 574-577 586- 
591 593 606 635 620-621 632 637 
645-647 650 65S--C60 662-664 667- 
668 674-67S 664 687 696 698 701 
703-705 709 712 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-751 793 796 802-803 
814-817 £22 833 643 845 848 858 
861 864 875 679 888 894-895 897- 
900 903 906-907 911-912 925 930- 
933 936 940 9^8 953 960 966 977 
984 990 952 99B 1000- 1001 1005- 
1007 1016 1023 1025 1037 1046- 
1047 1059 3D6i-1063 1073 1076- 
1077 1089 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 
1156 1163 117i 1197 1204-1205
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454 1455 1473 1482 
1484-1485 1469 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575-
1576 I57fc-1579 1591 1595 1600- 
1601 1606 1612 1615 1621 1624 
1626 1636-1637 1647-164B 1651 
1653 1656 165e 1661-1662 1672 
1675 1682 1684 1686-1688 16S0 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1766 



fetal brain



GIBCO



HFB001 



4 9 11-13 17-18 22-23 25 37-39 
42-41 50-51 54-55 58 60-61 65-66 



Tissue Origin



RNA source 



Hyseq
Library Name



72 7 5 77 60 82 £5 90-Sl 94 1~0C^~" 
102 107 110 li;.-116 110-119 122- 
123 126 128 134 136-140 147-14F
153-155 157 161 165 165-172 175 
181 186 188-18S 197-198 204-206 
208 210 215 222-223 225-226 230 
235-238 240-24: 247 253 256-25f 
260-262 267-269 276 279-281 284 
286 289 298 30C-302 307 310 316 
321-323 325 330-331 339 341 346- 

349 352 354 356-359 362 364-365 
371-372 377 375-380 382 384 387 

350 400 408 414-416 419 424 431 
434-435 438 44j-443 449 451 453- 
455 457-463 47C 472-473 47$ 477- 
478 482-483 486-488 490-491 493 
496 499-500 502-504 506-507 505- 
512 516 519-520 522 525-526 525- 
530 537-540 543-544 546-547 566- 
567 569-570 572-582 585 588 590- 
591 593 595 595 601 604 606-609
611-612 614-620 622-624 630 632 
636 643 645-647 650-652 654 659 
661 665 667-668 670-672 676 678 
681 687 689 692-694 697 699 710
714 717 721 727 729-732 734 736 
738 743-746 75C-751 759 763 766 
770 772 775-777 784 789 791 796 
799 802-805 810-811 814 819-821
824 826 830 834-837 839-850 854- 
856 858-860 862 864 869 871 876-
877 879 863 886-887 890-891 893- 
895 898-901 905 908-910 912-916
919 922-923 925 927 930-933 935- 
938 948 952-560 963-564 967 969- 
972 975 978-975 981 983 986-987 
990 992 995 997 999-1002 1005- 
100S 1011-1013 1016 1018-1019 
1023 1026 1029-1031 1033-1035 
1038 1041 1047 1050 1053 1057
1059 1064 1068 1070 1072-1073 
1078-1079 1081-1082 1086 1089 
1094 1097 1103 1107-1109 1113- 
111S 1121-1122 1127 1134-1135 
1138 1140 1143 1148-1151 1153 
1156-1157 1159 1167 1170 1175 
1193-1194 1200 1202 1207-1209 
1211 1216 1219-1220 1226-1227
1229 1232-1234 1240-1241 1243
1246 1249-1251 1253-1254*1258 
1267-1268 1271 1276 1279 1282 
1285-1289 1293-1294 1305 1307- 
1308 1312 1316 1320 1327 1338- 
1339 1341-1344 1346 1349 1355- 
1357 1359 1365-1366 1369-1370 
1373-1375 1379 1386 1389 1394 
1398 1409 1413-1414 1416-1417 
1420-1421 1425-1427 1430 1433
1437 1439 1442 1445-1452 1454- 
1457 1459 1463-1464 1468 1470 
1474 1477-1479 1489 1492 1494 
1497-1498 1501-1503 1507 1509
1511-1513 1517 1520-1521 1524- 
1526 1531-1533 1535 1537-1536 
1547 1554 1556-1559 1564-1567 
1571 1584 1587 1589 1594 1599- 
1601 1611-1612 1614-1616 1615- 
1620 1625-1628 1630-1631 1634 
1637-1638 1640-1643 1645 164e- 



Tissue Origin



RNA Source | Hyseq

Library Name



macrophage 



infant brain



Invitrogen



HMP001 



Columbia
University 



IB2002



SEQ ID NOS:



1645 2 6 i 3 1653-1655 1657-16S£ 

1664 26Ce 1667 1669 1673 167H- 
167? l663-:664 1686 1693 1703 

X704-17CE 1709 1713-1714 1717- 

1720 1727-1728 1731-1733 

1737-1731 1743-1744 1752 17S4- 

1755 17£' -760-1761 1765 177; 
1775 176 



5-8 1-0 >04-20S 503 634 67B 855 
87B 533 <)?8-9&9 3379 1448 1S04 



10 11 
37-35 



112 -! j? 
134-136 



1- 3S-3 8~ 22~- 2 3 25 29 34 
43 47 50-53 54-56 58 60-63 
65- 66 61-65 72-74 eO 82-83 86 
86-92 57 300 102-304 106-108 130 
135-116 336 123 328 330 
•jS-139 143 147-349 3 53- 
152 ISO- : S 5 163 365-167 169 372- 
175 ie;-:£4 286 1S3-196 398 203 
203-2C5 209-210 234 -21S 222 224- 
226 231-232 235-236 239 246-247 
252 25' 260 268-269 272 276-277 
279-281 266 288 2S3-292 295 29H 
300-301 304 307 330 313 323-323 
330>331 -33-334 339 346-347 349 
352 356-357 362 371-372 377 379- 
380 363*364 392 3S7 401 406 406 
411 433-414 436 436-439 422 428 
430-431 434-435 436 443 449 453- 
454 461 46-J-466 465-470 472-473 
475-476 478 482-483 487 490 492 
494 45-7 5C3 507-506 510-513 516 
519-520 524-526 530-534 536-540 
547 £50-551 561 563-564 566-567 
572-576 579 581-582 584-507 590- 
591 £S? ^55-597 607-609 611-633 
616-637 620 622-624 627 631 637 
643 641-647 650-655 657-658 660- 
665 667-675 689 693 695 697 699 
703 707 713-735 717 721 72B-731 
733-736 735 743 745 75l 755 759 
763 765-770 772 77e 780-781 765 
78B-7P9 753-794 799 803 608 813 
814 823-826 e30 834-836 840-843 
845 €46-650 854-855 860 662 864- 
865 870 872 875-876 878 886 888 
890-853 e^4-896 898 903-904 916- 
917 £15 572-925 92~*S2B 930-932 
934-S36 538 943 945-946 948-9S0 
953-554 559-962 966-969 977 979 
981 986-9S0 992 997 999-3000 
1004-1006 1014 1016 1018-1015 
1024-3025 1033 1036 1047 1053 - 
3052 3054*1055 3057-3059 1063- 
3064 3068*1070 1073 1083-1082 
1085 ice? 1108-1113 1116-112C 
1123-3324 1330 1132 1138 1140 
1149 115: 1153-1154 1163-1170 
1172 1174-1175 1183-1184 1186 
1190 3393-1194 1196-1197 1195 
1204 1206-1209 3211 1218-1222 
1226-1227 1229 1231 1234 1243 
1247 1249 1251 1256 1258 1263- 
1262 126S 1274 1279 3282 1283 
1285 1267-1289 1294-1255 1305 
1307 1333-1314 3316-1320 1325 
1332 1343-1342 1345 1349 1356 
3362-3363 136S-1366 1368-1370 
3374 3383 1383-1384 1388 1400 
1403 1406-1407 1413 1417 1420 



Tissue Origin



infant brain 



RNA Source



Hyseq
Library Name 



SEQ ID NOS:



Columbia 
University



infant brain



IB2003 



1423 


1429- 


1431 


342S- 


1436 


1 4 3 S - 


1441 


144 3 


1447- 


1449 


1451- 


1452 




145. S 


1457 


1459 


1463 


1465 


1468 


1470- 


1471


1475


1479 


1482- 


1483 


14 65 


1193- 


14 S4 


1496 


149C- 


1499 


• SC2- 


1503 


1505- 


1507 


1309 


1522- 


1522 


1525 


1528 


1531 - 


1533 


1542 


2546- 


1547 


1549- 


1550 


1554- 


1555 


156? 


1565- 


1567 


1569 


1575 


1580 


1SH?- 


15B6 


1588 


1590 


1592- 


1593 


2595 


1598 


3600- 


1601 


1606- 


1610 


1612 


1614-1616 


1619 


1621 


1624 


1626- 


1627 


1630- 


1633 


1637 


1639- 


1640


• 64*2 


1644 


1647 


1652 


2654- 


1655 


3 658- 


1659 


1664- 


1665 


1672- 


1673 


1676- 


•1681 


1685- 


1686 


1693- 


1695 


1701- 


1702


1704 


1708 


1717- 


1720 


3723- 


1724 


1726- 


1728 


1733 


1735- 


1741 


1743 


-174 4 


1752 


1755- 


1758 


1762 


1765 


1771 


1174 


1777- 


1778 


1786 









17-18 20-23 29 34 43 60 68-69
70-00 88 J00-1O1 107 110 112 118 
123 328 133 135-137 146 148 152 
159 166 16? 274 194 198 203 215 
223 225-226 229 235-236247 260 
276-282 2BL 290-252 295 300-303 
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382 
384 403 40f-409 414-415 453-4SS 
472 476 47f-479 490 503 SC7 516 
520 530 534 536-540 551 563 572- 
576 585 587 590-591 593 S95-596 
601 606 61V 616-617 620 622-624 
650 652-653 661 665 670-671 674- 
67S 670 60S 715 717 727-726 73C 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 862 889 694-895 898 904 917 
929 921-923 532 935-936 946 95C 
954 962 977 979 997 999-100C 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1267 
1170 1174 1193-1194 1196 1199 
1202 2206 1209 2220-3221 1226 
1229 1240-1241 1251 1258 1284 
3288-2289 1305 1314 1327 1333 
1344 1347 1350 1356-1357 1365- 
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 2440-1443 
2446-2447 3457 2459 1471 2499 
1503 1507 1509 1535 1546 1557- 
1559 1567 1572 1587 1595 159S 
2610-1622 3S15 1631 3639 1644 
1647 1657-1658 1673 1678-1681 
2683-1684 1701-1702 1708-2709 
1713-1714 1719 1757 1760-1761 
1765 1771 1778 



Columbia 
University 



IBM002



101 1713 139 152 260 279 290-292 
374 377 552 563 608-609 €53 659 
814 954 20C5-2006 1029-1030 113C 
1164 1205 1250 1294 1305 1320 
1327 1397 2432 1498 2507 1615 
1640 1694-2695 1763-1764 1767 
1779 



infant brain 



Columbia 
University 



IBS001



10 12 129 275 279-261 322 334 
372 446 551 563 623 652 667 669 
Tissue Origin | RNA Source

Hyseq
Library Name

SEQ ID NOS:



lung, 
fibroblast 



Stratagene



l.FfcOOj 



lung tumor 



Invitrogen



lgtoo; 



(71 -672 819' 94 9 "966 
a: SI 11 8B 1191-1194 



1258 1265 
1324-1325 
1448 1471 
1562 1569 



1271 
1342 
1482 



1113 13?. 
1196 127* 
1207 1317- 133 4 
1423 1440-144'. 
1525 1532 154' 



| 1647 1649 365fc 



1588 1591 1610 363? 



5^9 17 20-21 25 6 9-"6S 82 94~T0~5~ 
153 157 197-158 203 207-208 212- 
213 223 262 2 GO 233 302 321 32£ 
333 356 370 427 430 436 44b ^62 
472 493 498 503 516 519 527 525 
537-540 542-544 562 565 S67 566 
5^9-600 607 615 630 647 662-664 
652-694 712 719 74 5 748 775-777 
794-796 810 837 843-847 849 *<54- 
6S6 869 e76 903 934 953 955-956 
964 975-976 984 1000 1005-3C07 
1024-1025 1033 1039 1053 106'. 
1070 1072 1082 1112-1113 113': 
1136-1138 1140 1195 1223 123; 
1233 1246 1279 1285 1295 131. 
1320 1334-1335 1343 1427-1424 
1446 1478 1402 1493 1504 153'. 
3552 1555 1567 1575 1582 1591 
1620 1625 1632 1638 1645 1654- 
1655 1662 1680-16B1 1684 1681 
1690 1696 1702 1711 1733 174. 
1760-1761 1778 1785 



5-10 18 20-21 29 33-36 40 4? 52 
54-55 61 65-66 68-70 73-75 80 85 
8B-89 93-94 100 103 106-108 112- 
113 115-116 118-119 123-124 126 
130-132 135-137 139-141 143-144 
147-148 151-153 155-356 159 161 
164 169 173 179-380 1 85 190 1 92 
194 196-195 203-208 210 212-214 
216-217 215 222 233 240-241 244 
246 251-252 2S5-256 261-262 266. 
272 276-277 279-281 284 266 2ee 
290 295 298 301-302 309-312 ?l7 
321 329 332 341-342 344-345 348 
352 358-360 363 368 370-373 176 
380-361 384 389-390 398 400 4 09 
414 423 426-427 430 432-436 4 43- 
444 450-451 454 462 468 472-477 
480-483 487-468 490-491 493 496- 
498 500 503-506 509-512 515-516 
519 521-523 526 530 534 541 544 
547 554 557 564 566-567 572-576 
585-586 588-589 595-596 603 607 
611-612 615 619 621 623 626 630 
632-633 644 647 645 651 655-656 
660 662-665 667 669 672 683-684 
696 700 706 710 713 716 718-719 
722-723 728 734-739 743 750 752 
763 765-766 773-778 784-785 787- 
789 791 BOO 8C2-803 809-812 634 
824 826 828-829 832 838-839 643- 
845 849-850 852-855 657-861 864 
866 874 878-880 882 887 890-893 
897-898 902 504 906-907 910 936 
918-920 922 924-925 927 930-932 
934-935 937 947 950 953 955-9S6 
961 963 966-967 969 971 977-975 
9fil 984 986-967 990 992-993 995 
997 999-1001 1005-1007 3005- 
1012-1013 10ie 1020 1022-1C24 
1026 1029-1030 1033 1038 1041 
Tissue Origin RNA Source 



lymphocytes



ATCC 



Hyseq 
Library Name



SEQ ID NOS:



LPC001 



T045 2 04 7-2 DSO 1C5 2 1C54-10E' 
IOS9 1063-1064 1067-10": 107l->- 
1074 :C78 1085 1087 iocs :o?i- 
1097 1104 1106-1107 ncs 
1116-1117 1119 1126 ll34-;235 
1135 1141-1142 1144-ii^ L :14b 
1152-1153 1156-1158 1167 :170 
1172 U78 1195-1196 1156-1200 
1202 1204 1208 1214 123( 1215 
1222 ^227 1234 1241 124" }2S2 
I 1257-1258 1265 1267-127C 1 27e 
1278 :280-128l 1283 1265: I28fc- 
1285 1295 3300 1305 1308 332; 
1317-1321 1329 1338-133$ 1341 
1344-1346 1349-1351 1353-3355 
1357 1365-1366 1369 137P-137S 
1383-1385 1394 1357 140C 1402- 
14.03 1408 1417 1419 1423-1426 
1431 1433-1436 1438 1444 3446- 
1448 1454-14SS 1460 1466 146E 
1470 3474 1480-1481 1463 14Bfe- 
1488 1490-1491 1494-149C 150( 
1508-1509 1511-1512 1515-1516 
1519 1523-1524 1528-152? 1536- 
1540 1546 1549-1550 1555 2560- 
1561 1565 1567 1569 1575 158* 
1591 1593-1594 1596-155£ 3600- 
1602 1608 1614-1636 1616 362C 
1624-1625 1627-1632 1636 363* 
1644-1645 1647-1649 1652-3653 
1656-1662 1664 1666-1667 167C- 
1671 1673-1675 1676-1675 3683 
1685-1668 1690-1692 16^61€95 
3705 1705 1716-1717 1722 1727 
1730 1735 1739 1741 1743-17<4 
1748-1749 1753 1760-1762 3765 
1767 1770-1771 1773 1775-177* 
1778-1779 1786 



4 11-12 18 24-25 30-33 4fr 50-51 
56-57 68-69 80 92 98 103 3C5 110 
226 137 152-353 157 i€i 172 188- 
189 157 203 210 217-218 222-223 
225-226 229 231 247 251 256 264 
272 260-281 284 300-301 321 325- 
326 339 348 352 357 371 ?62 384 
390 400 404 412 414 421 423 426- 
427 430-431 445 447-446 451 454- 
455 475 503 516 526-527 530 537- 
540 545 556-560 563 574 577 589 
602 613 615-617 621 623 628-630 
636-637 647 649 657-659 €90 697 
717 723 755 764 775-777 780 786 
789-79C 793 800 802 822 E38 649 
866 86S 876 881-883 852 &98 906- 
907 911 921-923 928 575 590 992 
996 1O01 1004-1007 1033 1050 
1054 1C78 1107 1135 1140-1141 
1143 2348 1158 2163 11' 
1205 1216 1226 1231 123C 
1244 2250 1258 1260 1265 3265- 
1271 1290-1293 1308 1312 1317 
1319-1320 1339 1345-1346 1348 
1350-1351 1357 1367 1365 1379 
1381 1383-1384 1386-1387 1389 
1394 1397 1405 1423 1425-1428 
1431 1437 1446 1448 1461 1466 
3470 1472 1474 1482 1492 1506 
1528 1537 1546 1549 1592 1596 
1600 1603-1604 1606 1627 1636 



129S 
1241 



J 33 



Tissue Origin

leukocyte

RNA Source

GIBCO

Hyseq
Library Name

:/jc:oc

SEQ ID NOS:



jf?e 1647-1649 1*51 165£-365? 

i664 1676-1G77 1680-1681 1687- 

1660 1699 2711 1735-1716 1726 

1726 1737 1740 1746 1748 17S2 

1756 1756 1777 1775 



3-4 10-11 13 1S-1B 20-21 24-25 
30-21 35-36 40 43-45 48 50-51 
54-56 60-63 68-69 75 79-80 82-83 
65 68-31 93-96 98 100 103-104 
107-108 112 116 319 123 125-128 
134-140 142 147-145 151 153 155 
157 162-163 167 169-172 174 177- 
179 :86 19C 192-199 203-207 210 
212-215 237-219 222-223 229 235- 
236 247 251 255-258 260 262 272 
271-277 28C-281 285-286 297-301 
307-310 333-314 316-317 321 325- 
330 333-334 340-342 348-349 352 
351-358 370-371 380-385 387-388 
400 405 40P-410 412 414-416 421- 
425 430-431 434-435 437 439 441- 
442 445-4S3 453-454 456 459 461- 
464 468-412 474-479 481 483-485 
487-491 496 499-SC1 503-504 509- 
513 516-519 522 526-527 529-531 
534 536-540 542 547-549 5S3-559 
S66-567 571 574-577 579 582 584- 
586 589 592 595-597 601-602 604 
606-607 611-613 615-621 623 627- 
62S 633 636-637 642 644-650 65S 
659-660 662-665 667 669 674-675 
678 682-684 692-696 698 700 706 
708 710 716-720 725-726 729-736 
736-739 743-746 749 751 753 756 
759 765-766 768 770-775 780 784- 
786 788-790 793 796 793 800 802- 
803 810-811 814 817 819 826 828- 
830 832 834-836 838 843 B45-860 
863-864 86C-071 877-879 881-892 
894-896 696 902 904-914 916 919- 
925 927 930-932 935-936 941-942 
945 948-949 953 955-956 958 960- 
962 964 967 970-971 973 975 977 
985-990 992-993 995-996 999-1002 
1004-1009 JCZ1 1014 1017-1019 
1022-1023 3C25 1027 1029-1031 
1033-1036 1036 1041 1043 1047 
1050 1053-1054 1058-1059 1061- 
1062 1064 l06e 1070 1072 1078 
1085-1086 3C89-3091 1093 3097 
1106-1107 1110-1113 1115-1117 
1122-1123 3125 1129 1132-1133 
1135-1137 1340-1145 1152 1158 
1163 1168 3170-1174 1176-1178 
1180 1182-1183 1186 1195 1198- 
1200 1202 1205-1206 1211 1216 
1219-1221 1223-1227 1230-1236 
1238-1242 1247 1252 1254 1256 
1258 1261-1262 1254-1265 1269- 
1270 1272-1275 1277 1280-1284 
12B7-1293 1299-1300 1306 1308 
1312-1313 1317-1320 1322 1324- 
1330 1333-3335 1339 13^1 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425- 
1428 1430-1431 1433-1434 1437- 
1438 1440-1442 1446-1448 1450 
Tissue Origin | RNA Source

Hyseq
Library Name



leukocyte 



melanoma from 
cell line ATCC 
CRL 1424



mammary gland 



Clontech 



U1C003

Clontech

MEL004

Invitrogen

MMG001

SEQ ID NOS:



1453 1456 1459 i'<e'3'-~14~6T 1466 1470-1473 1474 1477 1478
1482-1488 1490-1493 1496 1501 1504 1506 1509 1512-1513 1516
1519 1521-1522 1524-1525 1527-1526 1531 1534 1538 1541
1545-1547 1549-1550 1553 1555-1556 1560 1565 1567 1575 1580
1589 1591 1594 1596 1598 1600-1602 1606-1608 1611 1614
1620-1621 1624 1626-1629 1631-1632 1636 1638-1639 1642
1644-1645 1648-1650 1653-1655 1658-1660 1662 1669-1670
1675-1679 1684-1688 1690-1692 1696 1700 1702 1707-1709 1711
1716-1717 1720 1723 1725-1727 1733 1737-1738 1741 1743-1744
1748-1749 1752 1755 1760-1762 1765 1769 1771-1772 1781-1784
278f











"4^35-36 44-45 61 68-69 75 B2 102 
219 139 154 179 197 244 280-281 
324 372 404 430-431 455 461 476- 
477 481 503 537-540 554 575-576 
581 589 608-609 621-622 624 630 
632 647 662-664 669 679 698 764 
773 775-777 80^ 848 851 856-857 
879 905-907 935 949 952 990 992 
1002 1113 1119 1170 1183 1216 
1236-1237 1241 1275 1346 1353 
1357 1359 1377 1506 1515 1534 
1553 1591 1600 1613-1614 1622 
I62e 1670 1676-3677 1691-1692 
1699 2733 1738 1772 



25 35-36" 43 80 304 126 128 150 
163 166 188-18S 397 210 215 220 
271 277 280-281 310 317 336-336 
345 351 372 380-381 383 387 412 
415-416 430 445 448 454 456 467 
462 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647 
660 665 734-735 737 759 778 787 
790 800 832 84S e56 859 869 87B 
883 887 905 914 932 934 958 976 
985 990 992 999- 1000 1025 1031 
1038 1050 1055 1068 1074 1088 
1099-11C2 1107 1136-1138 1149 
1156 1163 1172 1290 1295 2200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 12781290 
1293 1321 1320 3330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1424 1423 1437 Z442 
1465 1521 1529 2536 1539 1541 
1547-1548 15B2 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760- 
1761 



5-8 10 12 14-18 20-21 24 25 25
33-39 42-43 52 55-58 60-64 68-69
71 73-74 79-80 82 89 98 100 103
106 108 112 123 128 133 137 144-
146 148 150-152 154 158-159 165-
166 170-172 174 176 178 181-185
188-190 194-198 201-206 210 217-
222 224 227-228 231 233-237 247
252 253-254 256 261-263 266-267
271 276-277 279-281 284-286 288
RNA Source | Hyseq

Library Name



SEQ ID NOS : 









*? G n 
y,y\j 


297 2 


95 3C1 


3 04 


3 0 C - 


312 316 








32 0- 


221 3 


23-325 


3 27 


- 1 o c 

- j <i y 


331-332 








334 


23? 34i 3 4'i 


- 3 4 5 


348 


s z» V 3b>t 








35 9 - 


560 362-363 


^08 


3 71 










303 


3GG 3 


90 353 


- 3 95 


3 97- 


3 96 4 0^ 








4 C6 


4 12 4 


14-415 


4 23 


4 30 


434 - 437 








4 41- 


444 448 453 


- 4 55 


462- 


4 64 474 








4 76 


479 4f2 4R5 


- 4 86 


4 86 


A Q(\ A O A 








4 S5 


498 5 


03 506 


5 09 


-512 


SI 6 - 517 








51 ° - 


£20 5 


12 527 


529 


534 


C~i 1 CA 1 








• 547 


549 554 517 


562 


572 - 


C"7 A Can 








589 - 


591 5 


57 602 


6 07 


618 












632 634-640 


6* 4 




b ** O bDU - 










655 657-650 


O b U 


fee 
ODD 










b/2 


674-676 67S 


6 82 


6 88 


6 95 - 696 








706 - 


707 710 713 


717 


720 


722-73 0 








732 - 


734 736 736 


74 3 


747- 


748 750 








755 


759 761 766 


770 


780 


7B4 706- 








789 


794 0C3 OOC 


- 807 


809 


614 817- 








B22 


827-82? 837 


842 


854 - 


858 8S3- 








864 


866 8€?-67C 


872 


878 


881 889 








893- 


900 904 906 


- 907 


911 


916 919 








921- 


923 926 935 


- 937 


946 


948-949 








953- 


954 957 960 


- 961 


963 


965-966 








970 


977-978 984 


-989 


9 93- 


997 








1000 1001 1005-1006 1008 1013-
1014 1016-1017 1023 1025 1027
1032-1033 1036 1039 1043 1045
1055 1057-1058 1063 1068-1075
1077-1078 1085 1087 1089-1091
1095-1102 1107-1108 1112-1119
1121-1123 1131-1133 1136-1137
1139-1142 1144-1145 1148-1149
1153 1159 1167 1170 1172-117?
1183-1185 1190-1192 1196-1195-
1207-1208 1215 1216-1218 1222 •
1223 1225 1231 1234 1240-1241
1247 1253-1254 1258-1259 1263
1262 1270-1280 1283 1285-1286
1298 1307 1314 1316-1320 1323-
132£ 1330 1334-1335 1342-134^
1349-1352 1354-1355 1359 1365 •
1370 1377 1379 1381 1383-1384
1385 1405 1414 1419 1421-1423
1425-1426 1428-1429 1431 1434-
1437 1439 1448-1449 1454 1 A c -
1460-1464 1466 1471 1480 1 A O 1
1487 148S ^1491 1 H 7 0 i enc JL 3U5
1512 1519 1526-1528 1532
1536 1539 1542 1547 1549-1550
1554 1561-1562 1564 1567 1 17Z
1576-1579 1581-1582 1587-1586
1592 1594 1596-1597 1601-1602
1607-1608 1610 1612-1616 1616
1621-1622 1625-1626 1631 1635-
1636 1641 1643-1644 1647 1650
1652 1654 •1655 1657-1658 1660
1662 1664 •1666 1669-1671 1673-
1674 1676-1677 1680-1685 1689*
1692 1701 1706 1713-1715 1719-
1720 1723-1728 1730-1732 1736
1740 1742 •1744 1746-1747 1749
1751 1753 1760-1762 1765-1768
1771 1774 1776-1777 1779 1783-
1784 1786










induced neuron
cells

Stratagene

KTD001

29 35-36 80 116 123 156 163 181
214 230 280-281 284-285 307 321
330 340 358 371 375 377 380 382
422 424 492 497 532-533 542 546
Tissue Origin | RNA Source

Hyseq
Library Name



retinoic acid
induced
neuronal cells

Stratagene

neuronal cells | Stratagene



pituitary
gland



placenta 



NTUOOi 



Clontech 



Clontech



prostate 



rectum



Clontech



PIT004 



PLA003 



PRTOO} 



Invitrogen



REC001 



SEQ ID NOS:



5<:V SKf> 586 595 612 fe4 5- KTTs'C 
73< 775-778 780 792 799 823 826 
85fc 858 675 936 953 965 990 592 
2D',:-1043 1055 I0?2 2104 1193- 
1194 1206 1223 1246 1253 1274 
1286-1289 1291 1294 1311 1320 
l3<-9 1355 1412 1423 1485 162C 
161? 1645 1684 1705 1715 1751 



5-fc 78 268-269 277 383 431 506 
621- 677 731 999-1000 2159 242S- 
1426 154' 



29 <5-66 80 82 110 125 146 IS?. 
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
3S2 393 406 414-416 4 54 465-466 
47C 488 503 506 510- S12 519 537- 
540 572^574 597 602 607 623 647 
663 700 702 716 743 771 792 858 
904 948 954 977 100C 1005-1006 
1025 1064 3068 3122 i:48 1185 
1219 3226 1234 1246 1271 3283 
1295-1296 1311 1317-1320 1329 
133C 1350 1355 1365-1366 137e 
1383-1384 1400 1412 1445 1505 
3535 1547 1578 1647 1656 1683 
169C 1738 1749 1783-1784 



311 314 379 408 419 430 454 1055 
1095-1096 1272-1273 : 3 12 2320 
1376 1652 1671 1720 1725 3736 
2 74: .175E 



5-8 124 208 277 370 843 906-907 
3200 1337-1319 3359 3609 1621 
1 73 V 



9 46 57 71 107 147 171 177 197 
201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 616 649 657-6S8 662- 
664 710 729 767 771 789 820 B63 
871 874 890-891 905 938 945 963- 
964 998-989 1002 1025 1033 1045 
3061 3095-1096 1112 1325 1142 
1196 1198 1202 1232-1233 1241 
1256 1272-1273 1287 12S5 1313 
1333 2341 1344 2349 1360 1362- 
1363 1367 1437 1442 2447 1475 
2476-1479 1482 1489 1523 2517 
1527 1531 1536 1598-3599 l62e 
1636 1657 1680-1681 1687-1686 
1717 1738 3743-1744 



"l7-38~29 33 62-63 71 73-74 83 86 
113 226 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628- 
629 632 645-647 651 657-658 665 
717-719 721 725-726 730 748 750 
756 762-763 766 770 774 790 819 
825 643 849 851 881 903 909 548- 
949 560 986 996 1020 1023 1033- 
1034 1064 1067 1070 1075 1086 
11D8O109 1213 1130 1239 1153 
1159 3172 1178 1185 1187-1189 
1205 3220 1225 1240 3244 327: 
1317-1320 1323 1334-1335 2350- 
1351 2355 1369 1373 1375 1425- 
Tissue Origin | RNA Source

Hyseq
Library Name

SEQ ID NOS:



salivary gland | Clontech

salivary gland | Clontech

skin
fibroblast

ATCC

skin
fibroblast

skin
fibroblast

intestine

ATCC

Clontech

skeletal
muscle

Clontech

Clontech



SALOOi 



S/;Ls03 
"SFBOCT" 



SF3002 



SFB0C1- 



SIN002 



£KM003 



SKM002 



: 5 6 

> j e 

;■€ 5 
Vfci 

fc 



1436 143 9 1469 3 4 74 "~24~ 1Z 
1546 1587-3&BB 1SS2 15>S( 
1622 1627 3644 iCSO 166; 
1666 1669 1675-1677 1745 



•3 5 

jC o 



:73 



5 97 103 110 140 149 152 158 ~ 
237-718 242-243 256 301 30fc 
321 333 351 354 360 410 42"/ 
473 487 494 496 S03 S35 555 
570 572-573 590-552 624 636 
759 762 764 768 773 788 800 
826 848 865 879 506-907 92S 
563 1026 1020 102S 1040 1C46 
1066 1103 1150 1172 1181 
1281-1282 128B-1289 1298 
1320 1333 1336-1337 134< 
3373 1379 1424 1447 2445 
1482 1492 1494 1496 1513 
-1524 3537 1554 3596 2626- 

1636 26S2-1655 1658 1665 
-1672 1691-1692 



i 5 k 326 1423 1463-146* 



^0 14 00 



2t:. 736 1025 1253 



705 3119 1350 1631 1653 



24 4 

3 r j 

< Ofc 

4 34 - 
F 3 5 
563 

i 26- 

1 3 8 

5ce - 

3007 
3 085 
3 1*9 

2 2 75 

3 3 4 5? 
3 4 03 

: bci 

JC36 
1 hh2 
3 7 04 
3 7 26 

i *; € 2 



42 146-147 151 
260 271 280-281 
302 308 312 334 
412 414 416 423 
435 445 452 454 
521 523 543 
£69-570 585 
629 632 650 659 
750 764 780 798 
£66 887 892 894 
907 912 919 935 
-1008 J026-102B 
1097 1116-1117 
1199 1219 1234 
3316 1320 1326 
1351 1374 
1407 1423 
1521 1550 1556 
1638-3635 1645 
2671 1675 1684 
1721 2717 1719 
1729 1733-1734 
1767 1780 1785 



547 

592 



1387 
1428 



155 198 2C- 
286 2B8 298 
340 371 398 
426-427 430 
478 503 516 
54 9 555 559 
604 Gil 626 
681 710 714 
829 842 857 
e95 901 904 
997-998 3000 
1044 1055 
1131 1148 
1547 126* 
1341 1345 
1398 1400 
1468 149£ 
3585 1597 
3653 1656 
2691-1692 
2722 2725 
1743-1744 



3> 2 0-21 82 84 101 
U: 153 166 225-226 
7iS 329 362 412 414 
4^5 470 488 503-504 
66 0 673-675 715 773 
5< I S22 950 963 982 
3047 2063 1115-1117 
2226 1268 1284 1298 
J326-1337 1343 1409 
1509 1599 3624 1644 



118 234 14fc 
256 274 277 
424 440 452 
S37-540 647 
780 786 83C 
99C 992 1020 
1121 1134 
1321 1325 
1413-2424 
1653 1712 



skeletal 
muscle 



36t 1683 1712 



skeletal
muscle



Clontech 



SK.MS03 



235-236 1409 



skeletal
muscle
spinal cord

Clontech

Clontech



SXtts04 



spcoo: 



23 5-236 



i 21 17 30-32 35-36 43 46 6C 
Tissue Origin | RNA Source

Library Name

SEQ ID NOS:



adult spleen

stomach

Clontech

Clontech

thalamus

Clontech



FPes s"2 — 9 4 ice 110 ii6~n?~T57 

jf/ 19E 204-205 210 21B 22S 25( 
f T f 5- 277 280-283 300-302 304 3 3 e 

■ S3" 372 379 3e7 392 419 426-42^ 

43C 433 448 467 473 467 489 50t 

505 513 S±9 524 526 537-540 54> 

t^'i S4 9 551 559 567 56 9 -570 5S1-. 

£>07 S16-617 623 625 637 649-65C 

6 52 657-658 670-671 673 679 681- 

61 1 709 711 715 719 726-729 73', 

74 5-750 753 775-777 782 709 751 

805 820 832 834-836 847-849 854 - 
?lb 858 661 864 871-872 87S 8*4 

85 h 906-908 917 919 924 934 94; 
944 970 985 990 992-993 998 1013 

1039 1053 1059 1065 1072 107b 

1077 1082 1095 3097 1103 3105 

1116-1117 1128 1134 1151 117C 

1174 1192-1194 1215 1225 1241 

1243 I2e3 1294 1307 1312 132C 

1323 1327 1330 3350 1353-1354 

1356 1359 1368 1375 1400 140b- 

1407 3423 1429 3437 1443 1446 

2454 1470 1482 3492 1501 1S0P 
1511 1529 1538 1548-1549 1565 

1571 1578 159B 1600 1614 1625 

1627 1630 1639 1646 1651-1651 

1670 1686 1696 1740 1751 175 1 . 
17 73 



SPLcO: 



STO001



7:7 33 2 326 348 424 426-427 421 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 3«4£ 
1536 1579 1669 1686 1739 1767 



10 15-16 61 68-69 100 117 149 
197 201 227-228 231 249 273 280- 
26: 287 291-292 302 312 3S8 362 
426-427 430 446 462 47S 479 S35 
597 €20 630 652 662-664 722 739 
760 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1C7I 1135 1170 1208 1234-323^ 
l'/l-.S 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1476 
1467 1493 1498 1557-1559 1622 
1634 3651 1653 1729 



thaoo; 



thymus 



9 11 25 85 87 112 137 146 180 
190 298 206 210 212-213 235-23C 
235 261 268-269 279 290 301 325 
33«-334 341 351 356 364-365 379 
386 393 396 419-420 441-442 456 
477 483 500 525 531 549 567 606 
606-609 647 681 71S 725-727 736 
77< 782 784 794 827 883 890-891 
e99-900 961 997 999-1001 10O4 
1034 3055 1097 1129 1144-1141 
1130-3151 1157 1172-1173 1177 
1193-1394 1208 1220 3249 128C 
13C5 134S 1355 1369 1434-1435. 
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1667- 
1668 1703 1743-1744 1746-1747 
175? 



Clontech 



THM001 



44-45 54 S7-59 62-64 79 104 12; 
226 134 153 193 212-213 218 242- 
243 258 274 277 279 297 301 307 
327 330 333 342 351 358 371 410 
430 445 465-466 468 471 483 487 
493 503 506 509 517 526 535 537- 
Tissue Origin

RNA Source

Clontech

Hyseq
Library Name

SEQ ID NOS:



540 546 548 554 56"; 584 "?b6~ 591 612 621 638-640 645-647 649
656 660 665 670 69e 7j 0 720 720 725 735 746 759 762 766-767
775-777 780 784-785 800 802 809 824 826 828 845 85] 858-859
864 866 870-871 878 88< 887 895 899-900 927 930-931 983 986
990 992 999 101< 1029-1030 1033 1059 1066 1073 1103 1107 1113
1116-1117 1119 1140-1142 1158 1163 1172 1177 1195 1206 1209
1213 1216 1218-1219 2223-1222 1227 1271 1277 1282 1320 1329
1349 1367 1369 1383-1384 1417 1419 1423 1425-1427 144E 1477
use 1493 1536 1554 1620 1644 1646 1649 1654-1655 1661-1662
1669-1670 1674 1676-1677 1685-1688 1707 1711 1731-1732 173'.





5-9 1S-21 25 33 35-36 43-45 Ad 
50-51 54-55 60 75 £3 87 85 93 
98-100 102 105 112 117 135-137 
1 141 143 146 157 1C7 169 192 196 
212 217-219 222 224 229 233 235- 
236 240-241 244 253-252 256 261- 
262 268-269 286 286 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 36C 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461 
464-467 470 472 474-476 483 486 
497 500 504 506 513 526 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 63C-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698 
700 7C3 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798 
800 820 823 829 834-836 841 848 
854-856 859 861 864 870-671 881 
890-891 898 508-909 913 928 933 
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 2014 1016 1035 1043 1073- 
1074 1079 1089 1097 1109 1114-
1117 1122 1131 1140-1142 1144-
1145 1163 1172 1175-1177 1186
1196 1198 1206 1211 1216 1220
1223 1227 1234-1243 1261-1262
1267 1271 1280-1283 1284 129C
1308 1317-1320 1322 1324-1325
1327 1330 1334-1335 1335 1346
1350-1351 1355 1357 1360 1370
1374 1377-1379 1386 1385-1390
1392 1397 1400 1402 1406-1407
1417 1423 1425-1427 1440-1441
1466 1474 1477 1483 1493 1498
1504 1506 1525 1536 1545 1549
1566 15S4 1598-1600 1608 1611
1614 1621 1623 1625 1632 1639
1641 1644 1647 1649 1653-1656
1658 1662-1663 1671 1673 1678-
1681 1686-1688 1693 1705 1707
1711 1717-1718 1726-1727 1731-
1733 1737-1738 1743-1745 1758-
1761 1771-1772 1779 1786
Tissue Origin
|~t.fiy

RNA Source

SEQ ID NOS:

Library Name

.4 -l-sooT 



trachea 



Clontech



TRc?01 



4 9-10 ;0-22 37-29 48 50~5l~~54- 
57 60-Cl 65-6f : 71 83 94-96 58- 
100 101 104 110 112 115-117 119 
123 127 133 13G-137 140 149 152- 
153 155 1S8 163-164 1G8-169 171 
186 190-192 1 S7 201-203 219-220 
229 232-237 246-247 253 256 258 
262 265-266 26E-269 277 280-281 
284-286 288-285 298-299 302 3C5- 
311 317 321 326 332 335 341-342 
344 34e 350 354 356-359 363 36B 
371-373 382-383 385 394 39B 400- 
401 411 414-415 421 424 430-431 
433-436 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
487-488 490-454 496-497 500-501 
503-504 506 509-513 516-517 519 
524 526-527 525 535-540 547 549 
562 564 569-570 575-576 5BB 594- 
595 601-602 604 606 620 612 615- 
617 619-623 628-630 634-635 642 
641 649-651 660 662-665 668 670 
681 690-654 696 698 700 709 721 
727-729 732 734 738 740-741 743 
745 750 759 76a 763 765 770 773 
780 785 7?5-796 798 802 804 823- 
824 826 828 833 638 841-845 847 
849 857-860 867 e74-875 878 8BC- 
881 887-868 890-892 894-895 898 
908 910-911 913-514 922-923 926- 
927 929 ^32-934 537 939 941-942 
948 953 967 961 963-964 966 978- 
979 981-982 937 990 992 1001 
1004-1006 1013 1014 1020 1024 
1033 103^-1039 1044 1047 1050 
1052-1054 1056 1058 1060 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1242-1143 1146-1147 
1149-1150 1156 1161-1164 2167 
1170-1173 3177-1182 1190 2192 
1197 120C 1204 1208-1209 1214 
1217 1219 1222 1230 1232-2233 
1235 1241 1245 1247 2254 1257- 
1258 126C 1262 1271-1273 1283 
1286-1285 329S 1306 1314 1320 
1330-1332 1334-1335 1342 2345 
1349 1365-1367 1370-1372 1374 
1381 1394 1407 1419 24281436- 
1437 1440-1442 1443 1446-1449 
1454 1459 1462-1462 1468 1470- 
1471 1475 1477 1475 1482 1491 
1497-3498 1504-1505 1507 1523 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 3550 1553 1555- 
1559 1562 2567 1578 1590-1591 
1597 1599-2601 1612 2614 1616 
1619-1620 1622 1624-1626 1628 
1632-2632 1634 1636 1639 1644- 
1645 1646 J651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1676-1681 1683-2686 1689 
1691-1692 1703 1?09-1711 1717 
1724-1726 1729 1734 3737-1738 
1740 2743-1744 1749 1753 2759- 
1761 1770 1777 1786 



9 29-31 46 48 87 104 107 110 135 
158 222 262 266 286 301 318 331 
Tissue Origin

RNA Source | Hyseq

Library Name

Clontech

UTFOC J

SEQ ID NOS:



352 372 277 3£4 A 14 4 24~44 h~ 446~ 

454 472 474 491 496 560 579- 588 

£93 597 607 612 626 6PI 702 719 
810 955> fc66 B78 894-895 912 916 
S22 332 93S 1046 1075 1080 1099- 

1102 111? 1208 12ZS 1232-123? 

1237 12f; 2312 1385 I3e7 1405 

1414 1424 1430 1437 1447 1S05 

1569 1575 1586 1600 1641 1653 

1667 167j 1676-2677 1683 26S1- 

16 92 17 11 1717 1726 1772 
TV" 19 25 <1 4 6 57-58 61 8 5 " 104 

ICE 139 2^2 174 198 200-201 206 

263-265 274 290 387 408 420 43fc 

446 446 452 4 73 491 493 499 503 

506 513 £39 522 526 530 542-543 

560 601 610 632 659 665 "20 751 

773 780 833 845 857 072 877 912 
925 934 937 996 10091011 1018 

1050 1075 1107 1124 117C 1219 

125B 1279 1287 1310 132C 1323 

1343-11144 1375 1437 1451-1452 

1478 1461 3498 1519 1521 1536 

1552 1579 1597 1602 1606 1620 

1626-1627 1649 1652 1661 1670 
1719 1722-1723 



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TABLE 2 



SEQ
ID
NO:

ACCESSION
NUMBER

Y41736

SPECIES

DESCRIPTION

SMITH-
WATERMAN
SCORE

% IDENTITY


Homo
sapiens


Human PROllH protein
sequence.


13 96 


IOC 




Y666S6 


Homo
sapiens


Membrane-bound protein
PKC94 3.


23B9 


99 




AF113126 


Homo sapiens


IL-1 receptor-associated
kinase-M; IRAK-M


3043


rioc - 

! -— 




AF01780* 


Mus musculus


2n~35 transcription factor


6351 






X02761 


Homo sapiens


fibronectin precursor


1053b 


98 


t 


X02761 


Homo sapiens




8990 


89 


X02761 


Homo sapiens


fibronectin precursor


12564 






AUO lib 


Homo sapiens


Rab6 GTPase activating
protein, GAPCenA


5253 


99 


] <-' 


w B o b 0 1 


Homo sapiens


Human stomach carcinoma clone 
HP1 04 IS - encoded protein. 


238^ 




1 00 

90 1 


1 1 


AF117754


Homo sapiens


thyroid hormone receptor-
associated protein complex
component TRAP240


11336 


1 


- ' 


Z97630 


Homo sapiens


d-J4 66W1.4 (novel protein
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G)))


89fc 


1 00 

f 


i :• 


Yb0620 


Homo sapiens


Protein regulating gene
expression PRGE-Ij


1894 


QC 1 




AF213457 


Homo
sapiens


triggering receptor expressed
on myeloid cells 2


1238 


1 00 


K 

1 '.' 


AF233453 


Homo sapiens


RACK-like protein PRKCBPJ 


3124 




AF201303 


Homo sapiens


dhfr oribeta-binding protein
RIP60


3130 


98 


If 


AF064205


Homo sapiens


dynactin 1 p150 isoform


6377 


1 00 


J c 


U00059 


Saccharomyces
cerevisiae


Yhrl21wp 


174 


26 


2: 


AB032902 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1801 


99 


?? 


AB032903 


Homo sapiens


guanosine monophosphate
reductase isolog 


1485 


99 




AF140507


Homo sapiens


Ca2+/calmodulin-dependent
protein kinase kinase beta


3083 


99 


2? 


AF140507


Homo sapiens


Ca2+/calmodulin-dependent
protein kinase kinase beta


2300 


99 


2< 


AJ289131 


Homo sapiens 


chondroitin 4-O-

sulfotransferase


2213 


99 


2S 


U33460 


Homo
sapiens


DNA- directed RNA polymerase 
1 t largest subunit 


8777 


98 


2( 


Y44488 


Homo sapiens 


ACRP30R2 variant protein. 


1307 


100 




Ufl J ' V J. 


X-T^fllfv can! ij n c 


ribosomal protein L23a


791 


100 


2t 


U02O32 


Homo sapiens 


ribosomal protein L23a


767 


97 




Y4 2 3 24 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77 . 


3083 


99 


31 


W71 74 9 




Human ubiquitin conjugation
system protein 2 


735 


9C ! 


3; 


W72 749 




Human ubiquitin conjugation 
system protein 2. 


631 


8? 


32 


AF23 1 917 




long-chain 2-hydroxy acid 
oxidase HA0X2 


1811 j 100 

! 


33 


Z29481 


Homo sapiens


3-hydroxyanthranilic acid 
dioxygenase


1507 | 99 

i 


34 


AB0C1451 


Homo sapiens 


Sex 


2869 


100 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1667 


99 


36 


Y00644 


Homo sapiens


precursor polypeptide (AA -34 
to 287) 


1104 


98 


37 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence. 


3586 | 78 
4 726 \ 93 


3fc 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence. 


i 
39 | Y78795


Homo sapiens


Human antizyme-2 (AZ-2) amino
acid sequence.


3S56 


• 


40 | U93121


Homo sapiens 


M- phase pnoFphopye>tean- 3 


3747 




4Z ' Y42750 

1 


Homo sapiens


Human calcium binding protein 
I (CaBP-1). 


79i> 


100 " | 

J 

100 ~ " ; 


4i 


AF28252fc 


Homo sapiens


latexin


1189 


4> 


G0215G 


Homo sapiens


Human secreted protein, SEQ
ID NO: 6231. 


384 


t ** j 


44 


U19617 


Mus musculus


El£-2 " " ' 


2724 


r«B — 1 


4 £ 


U19617 


Mus musculus


Elf-1 


2062 


86 ~1 


4e 


AF10D756 


Homo sapiens 


osteoinductive factor OIF


1S38 


~- 

100 * 


4 1 


Y8 7592 
X04 14l> 


Homo sapiens 


Human SPROUTY-1 protein, SEQ
ID NO:24.


1737 


i 


4 9 


Homo sapiens


T3 gamma precursor (aa -22 to 
160) 


942 


99 1 

1 


5 3 


X6 3 54 7 


Homo sapiens 


oncogene


504S 


99 I 


5/ 

~sl 


M94043 


norvegicus 


rab-related GTP-binding
protein


1089 


96 


L317B3 


Mil o rrnfifMil m c 


uridine kinase 


917 


L - 1 


54 


X83 973 


Homo sapiens 




4 '186 


98 


55 


AF2 24 741 


Homo sapiens 


chloride channel protein 7 


4128 


95 


51; 


W748D5 


Homo sapiens 


Human secreted protein

encoded by gene 77 clone 

nOfcjHij x 1 . 


1491 


100 


57 


Z509D7 


Homo sapiens 


Human ?BC- 2 cDNA from second
transcript . 


4824 


1 oc 


58 


D79994 


Homo sapiens 


similar to ankyrin of
Chromatium vinosum.


6089 | 99 


b« 


D79994 


Homo sapiens 


similar to ankyrin of
Chromatium vinosum. 


4014 


1 91 


6C 


Y59738 


Homo sapiens 


Human normal ovarian tissue
derived protein 11.


601 


100 


61 


A8031 0€9 


Homo sapiens 


protein containing CXXC 


1390 | 100 

1 

• 


- 

62 


Y6 66 60 


Homo
sapiens 


Membrane -bound protein 
PR0783 


2492 


1 99 

i 


6 3 


I w O D D V 


Homo
sapiens 


Membrane- bound protein 
PR0783 


1709 


99 


64 


S7001 3 


K (1 \, L VI J> S>p . 


tricarboxylate carrier 


895 




6S 


AF139516 


Rattus
norvegicus


A-kinase anchor protein 


178 


III — j 


66 


W29666 


Homo sapiens 


secreted protein. 


157 


30 


67 


AJ24S738 


Homo sapiens 


claudm-lb 


1206 


100 


66 


AF099136 


Rattus
norvegicus 


GLUT4 vesicle protein 


41B3 


... —j 


62 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4906 


Bt 


70 n 


Z82059 


Caenorhabdit 
is elegans 


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 } 


72 


AF224278 


Homo sapiens 


PMEPA2 protein 


1282 


10C 


72 


AF126426 


Homo sapiens


neurotrimin


1809 


100 » 


73 


Y416S2 


Homo
sapiens 


Human MSK2 protein sequence.


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein-1 


14B5 


74 


"76 ■ 


AE0004 06 


Escherichia 
coli 


putative DNA topoisomerase


9S0 


100 


77 


X9930i 


Homo sapiens 


Pop! 


65S 


100 


78 


AL,i36538 


Schizosaccha 

romyces 

pombe 


similarity to S. cerevisiae 
kti12 protein


210 


31 


79 


AF129756 


Homo sapiens 


G4 ( 


1554 ! 


99 


I 80 
I 

, 


Al.0yC76ts I Hcmo sapiens 


doe 58 HI 6 .2 
(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))


2 03 3 


1 00 

1 
i 


[-81 

1 

1 


Al*09676e 


Homo sapiens


dOBSSBie .2 
(phosphatidylserine
decarboxylase (PSSC, EC 
4.1.1.65)) 


122C 


82 


X573S3 


Homo sapiens 


1-8D 


677 


i *t 


83 


AC005594 


Homo sapiens


R26984_ 1 


270C 


1 9fc 


84 


X73113 


Homo sapiens


fast MyBP-C


C CjC c 


£9 


85 


AF09733C 


Homo sapiens 


HI chloride channel; p64Hl; 
CLIC4 


13CS 


99 


86 


AB016423 


Mus musculus


SH2 domain-containing protein


1" 13f0 


78 


87 


AF27215: 


Homo sapiens


adapter protein C1K5 


30*-4 


99 


88 


AF: 96325 


Homo
sapiens


triggering receptor expressed 
on monocytes : 


1214 


100 


89 


ABO 1 6879 


Arabidopsis
thaliana


Contains similarity to pre-
mRNA splicing
factor~gene_id:MRB17.2


634 j 36 

1 


90 


A J 1 "* 3 7 2 1 


Mus musculus


homeodomain protein


65s ( 57 


91 


AJ242864 


Mus musculus




61 f 


6: 


92 


A6I971 


unidentified 


MCSF 


11676 


i~95 


93 


Y£936b 


Homo sapiens 


Human PRO1250 (UNQ633) amino 

ariA «:/-»rr))*» r>/-o Q rn in \} r 1 ■ H £ 

tlLiU Ol. UUCiJCc O Cu il/ >V J. OO ■ 


3890 


1 00 


94 


YR723: 


Homo sapiens 


Human signal peptide 
containing protein HSPP-S 
<5RO 11) NO ■ ft 


1C3 I 


100 


9& 


AF22 7741 


Rattus
norvegicus


protein kinase KNK3 


24 2 1 

I 


St 


9 6 


AF22 7741 


Rattus
norvegicus 


f/tOCtJJJ A i J J JSC rrjx J\ j 




94 


97 


YS2S1? 


Homo sapiens 


Human 0XR£-1C 


1621 


100 


98 


AL021366 


Homo sapiens 


c x v. rv u / z i. v . j (m ncsin rciaceo 
pr o t e in • 


3423 


100 


99 


AC00SV33 


Homo sapiens 


R33083 


1974 


9f- 


100 


Y9S293 


Homo sapiens 


Human GEF containing NEK-like
kinase substrate sGNK. 


4 OS', 


9S 


| 101 


ALllSbOl \ Homo aapieno 


dJ1191Nl6.l (A novel protein 
(translation of the cDNA
DKFZp566A0946 , Em:AL050069) ) 


1501- 


IOC 


1 102 


AJC06267 


Homo sapiens 


ClpX-like protein


3 23? 


10C 


r 303 


AF1007S? 


Homo sapiens 


ancient ubiquitous 46 kDa
protein AUP3 


204; 


96 


104 


AB015982 


Homo sapiens 


serine/threonine kinase 


4711 


100 


aos 


AF151074 


Homo sapiens


HSPC240


8 3 i 


64 


106 


M35522 


Canis

familiaris


GTP-binding protein (rab7)


354 


50 


| 107 


R99800 


Homo sapiens 


NTlI-l nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


1 loe 


AF125533 


Homo sapiens 


NADH- cytochrome b5 reductase 
isoform.


1290 


9; 


109 


AC005614 


Homo sapiens


F23269^; 


33Gf 


9r 


110 


AF064729 


Homo sapiens 


RAN binding protein 16 


32 8 f 


100 


111 


X52425 


Homo sapiens


interleukin 4 receptor


4 4 9t 


100 


112 


Y41686 


Homo
sapiens 


Human PR0274 protein 
sequence . 


22er 


100 


113 


W15506 


Homo sapiens 


Mitogen activating protein 
kinase ERK1 . 


159: 


100 


114 


Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


3190 


99 


iis 


AL049548 


Homo sapiens


dJ398G3.1 (ortholog of rat
CPG2) 


34 5 V 


95 


116 


AF189817 


Mus musculus


evectin-2 


1124 


90 


117 


W30891 


Homo


Human cytostatin III protein. 


711 


99 


"~i~iT 


AF 1 j 6 6 1 £ 


sapiens ; 




homo sapiens | I'KOlOJt 


146f^ I^IOC 
"174*8 tl'OC 


119 

~T2"( 


Y0fc«l« 5 


Homo sapiens | alpha 4 protein


AFC- ? 8 070 


Crosophila j Lisl Homo 5 09 
mel anogaster ' 


19; j 3S 


121 


AFC524 32 


Homo sapiens | katanin p80 subunit


I ie: I 21 


122 


Y7074i 


Homo sapiens j PSEQ-I protein encoded by 
| NSEQ ger.« associated with 
I matrix model line . 


263-/ 


98 

> 


122 


AF063246 


Homo sapiens 


HSPC026 


2132 


100 


124 


Y270S< 


Homo sapiens 


human viral receptor protein 
IACVKP) 




99 


125 


M63105 


Leishmania 
major


Glycoprotein 56-92 




27 


126 


U7546? 


Drosophila

melanogaster


i Atu 


935 


36 


127 


Z6B220 


Caenorhabdi t 
is elegans 


Similarity to Human ADP/ATP
carrier protein.


43 e 


43 


128 


AFOC-5927 


Rattus
norvegicus 


protein phosphatase 2C 


1927 


I' 4 


129 


W92ybfc 


Homo sapiens


Human zsin4 4 protein. 


463 


100 


130 


AF115391 


Lactobacillu
s sakei 


ribokinase RbsK


50C 


3'i 


132 


X934 96 


Homo sapiens 


22-Glutamic Acid-Rich Protein


125C 


100 


132 


X9349e 


Homo sapiens 


2 "j -Glut ami c Acid-Bich Protein 


91€ 


87 


133 


WS2913 


Homo sapiens


Human DBI/ACBP -like protein 

(Dbih; . 


70S 


97 


134 


Y84444 


Homo sapiens


Amino acid sequence of a
human RNA- associated 
protein . 


3230 


100


13 b 


M6 9 j 6 j 


Homo sapiens 


non- muscle myosin V 


18S ; 1 20 




W74 662 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83 . 


4 8C 


100 

1 


13 7 




Homo sapiens


Human secreted protein 
encoded by gene 75 clone 


655 




13H 


AL0I-2S2 0 


Homo sapiens 


dJ349A12.1 (similar to 
KIAA0701 protein) 


4 24 


39 


139 


AF020261 


Santalum 
album 


proline rich protein 


119 


"TCP 


140 


X70? 94 


Homo sapiens 


zinc finger protein 


163'. 


100 


14) 


Y064 3 9 


Homo sapiens 


Human protease HUPM-8.


936 


100 


142 


ZfcB4 93 


Caenorhabdit 
is elegans 


predicted using Genefinder


365 J 42 


143 


AB01 8107 


Arabidopsis 
thaliana


ADP-ribosylation factor-like
protein


59fc 


6* 


144 


AF16 1483 


Homo sapiens 


HSPC334 


58C 


51 


145 


Y84902 


Homo sapiens 


A human proliferation and
apoptosis related protein. 


48C 


100 


146 


AB0C4 906 


Ipomoea 
purpurea 


transposase


146 


2C 


147 


AC0C7351 


Arabidopsis 
thaliana


F3Fi9.ifc 


647 


31 


146 


W75:SS | 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone
HNTME13 . 


1494 


98 


149 


AF0bb4 9D 


Homo sapiens 


cAMP-specific
phosphodiesterase 8A


3710 


99 


150 


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
KHH-7. 


785 


SB 


151 


U103 97 | Saccharomyce 
s cerevisiae


Yhrl46wp 


51b 


53 


152 


X734 7e | Homo sapiens 

i 


phosphotyrosyl phosphatase
activator


1719 


99 


153 


AL04 9697 | Homo sapiens 


d03B21l0.5.1 movel protein 


2034 


99 










similar to argany'i - 1 kNA) 






1S4 


AF16S602 


Homo sapiens


cytochrome b5 reductase b5R.2


u— - 




ist 


X94701- 


Homo sapiens


rab2*- 


[ 3I2t 


9 9 


~Tse 


Y257i( 


Homo sapiens 


Human secreted protein 
encoded from onne i . 


~JTT. 


IOC 


15f 


K7740< 


Homo sapiens


Secreted salivary polypeptide
zsig32 . 


53 ~ 


100 


159 


Yl724f 


Homo sapiens 


Human protein kinase
inhibitor-2 (PKI-2).




10c 


160 


J0497C 


Homo sapiens 


carboxypeptidase M precursor


239: 


10c 


163 


W54 04 C 


Homo sapiens 


Human interferon-inducible
protein, H1P1 . 


4 


98 


162 


AL022724 


Homo sapiens


dJ4l3H6.l.l (hamster 
Androgen-dependent Expressed 
Protein like putative 
protein) (isoform 1)


133". 


10 0 

1 

1 


16? 


AF1 ? u SI S 
*^ 4 .A- < s ~> -j -j 




r>r>91 homnlnc 


191- 


i 4 S 

1-15 


1 64 


G03 6 3 * 




11 Li 1UO it OC^JLC UCU ^JX ^ ^ C 1 1 1 / OI-V 

ID KO: 7713 


4 63 




16£ 


AJ2SC639 


Homo sapiens 


serine/threonine protein
kinase 


144. 


TT 

I 


* 6t 


LO 964 9 


Zymomonas
mobilis


zm£ 




PV7 


167 


Y 7 3 3 3 7 


Homo sapiens 


HTRM Clone 194 4 530 protein 
sequence . 


1 204 


3 00 


16 8 


mooc « r 
WtJBb 4 . 


Homo sapiens 

- — . — ■ — [ 


Secreted protein encoded by
yene 11 z cione nUAr l /j.. 


1084 


100 

I - ' 


A b d 


AF2 3 4 7 31 


Homo sapiens 


ATP-dependent RNA helicase


- - 


1 00 


170 


AE000e73 


Methanobacte
rium

thermoautotr
ophicum


conserved protein


16* 


27 


173 


Y276R4 


Homo sapiens 


Human secreted protein 


821 


| 100 


172 


AF2 2 6 04 4 


Homo sapiens 


HSNFKK 


2904 


100 




AJ24 5 54 6 


Homo sapiens 


neuroglobin




1 oc 




E4 3 94^ 


Homo sapiens 


This gene is novel.


3 2 0 j 


1 00 


17b 


Y 0 7 9 2 


1 1 r~irr\ r\ cam one 


u 1 f - jlm 1 1 vj J. J iy pi Ulc 1 1< 


1201 


100 


176 


W903 3& 


sapiens


Unman HP! nftrariT nnup rsvrtt***ir 


96f 


1 00 


177 


Y416 7! 


Homo sapiens 


Human channel-related


112; 


100








molecule HCRM-3 . 






1 78 


Y41 6 74 


Homo sapiens


Human channel-related
molecule HCRM-2 . 


93( 


r 55 


"T79 


AF22 04 92 


Homo sapiens 


krueppel-like zinc finger
protein HZ?2 


410C 


99 


18C 


X030*4 


Homo sapiens


C1q B-chain precursor


124C 


100 


181 


US734 4 


Mus musculus


Meis3 


1813 


89 


182 


U573 4', 


Mus musculus


Meis3 


174 3 


86 


184 


U57344 


Mus musculus 


Meis3 


107C 


66 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear
protein 


i3ev 


58 


186 


AF2C03S7 


Mus musculus


pantothenate kinase 1 beta 


1605: 


82 


181 


W750Sf 


Homo sapiens 


Human secreted protein
encoded by gene 2 clone 
HLDBG33. 


use 


99 


188 


AJ2S252 9 


Homo sapiens


suppressor of sterile four 1 


2424 


100 


190 


XS413* 


Homo sapiens 


protein- tyrosine phosphatase 


3701 


100 


191 


Y222C3 


Homo sapiens


Human calcium-binding
phosphoprotein, CBPP-l, 
protein sequence. 


1083 


99 


192 


W636 9V 


Homo
sapiens 


Human secreted protein 12. 


1971 | 


10c 


193 


WB777: 


Homo sapiens 


Human serum glucocorticoid-
regulated kinase (K-SGK2)
polypeptide . 


2605 | 
i 


95 

"5? 

! 


ISA 




Mus musculus


bromodomain-containing
protein EP7t 


1 9' 

""19 ' t 


Y0075? 


Rattus
norvegicus 


serine dehydratase (AA 1 -
327. 


S'9< 




vj 9 s :*<j v 


Homo sapiens


Human foetal brain secretec j 2£5( 
protein fhl7D_7. j 


100 


19: 


AB028859 


Homo sapiens


hDjb 


189C 


100 


1 9E 

1 9^ 


Vj 9 5 g 3 2 




Homo sapiens secreted protein
gene clone hm236_l . 


1614 j 100 


Y4 4 27'y 


Homo
sapiens 


Human nucleic acid methylase- | 209(- | S9 

s. ' 1 


2 00 






hFACPLl 


225f | 106 


2 0l 


X54162 


Homo sapiens


64 Kd autoantigen


291f 


99 


202 


G02 06 1 


Homo sapiens


Human secreted protein, SEQ
ID NO: &14S. 


S5E 


99 


2 07i 



X 1 3 8 8 £■ 


Nicotiana
tabacum


extensin (AA 1-620)


i8S j 33 

1 


0 04 204 


Bos taurus


32 Kd accessory protein 


183*. 


J_10C 


20b 


004204 


Bos taurus


32 kd accessory protein 


11 01 


100 


201 


V 8 7 2 6 3 


Homo sapiens


Human signal peptide 
containing protein HSPF-60 
SEQ ID NO: 60.


1311 


10c 


206 


Y0286C 


Homo sapiens


Fragment of human secreted
protein encoded by gene 65. 


936 


98 


209 


A1>1 2] 869 


Homo sapiens 


OJ1076E17.1 (KIAAC823 protein 
(continues in AL023803))


694 


S4 


21C 


AF226732 


Homo sapiens


NPD007 


134* 


I! 6 


2i: 


X66295 


Mus musculus


C1q C chain


9 70 


73 


21? 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


'J66 


100 


2i:- 


22932b 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcK: 


b42 


98 


214 


AJO02O3O 


Homo sapiens


progesterone binding protein


116? 


10 0 


21b 


X70649 


Homo sapiens 


member of DEAD box protein 
family 


3 933 


10c 


21b 


AF2SDSS8 


Homo sapiens 


claudin-? 




99 


21'* 


70,021453 


Homo sapiens 


dJ821Dll.l (PUTATIVE protein) 


259 


100 


2ie 


Y08S6S 
t"Y94452 


Homo sapiens 


UDP-GalNAc: polypeptide N- 
acetylgalactosaminyltransfera
se


3331 


99 


21? 




Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AL035S21 


Arabidopsis
thaliana


putative protein 


315 


42 


221 


AL031/86 


Schizosaccha
romyces
pombe 


putative proline-tRNA
synthetase 


611 


41 


222 


AL109736 


Schizosaccha 
romyces
pombe 


WD repeat protein 


626 


40 


223 | XS2493 


Glycine max 


DNA-directed RNA polymerase


] 36 


23 


224 1 AL03S6S9 


Homo sapiens 


OJ979N1.1 (d09V9Ml.l) 


S19P 


98 


22h | AB032401 


Mus musculus 


mmDj4 


1761 


92 


226 


AB0324O1 


Mus musculus 


mmD j 4 


1988 


92 


227 


X83502 


Saccharomyce
s cerevisiae 


J1007 


112 


26 


228 


X83S02 


Saccharomyce j J100" 
9 cerevisiae J 


79 


25 


229 


AF143723 


Homo sapiens | heat shock protein HSP60


2 = 57 


99 


23 0 


Y66677 


Homo | Membrane-bound protein
sapiens | P3082fc .


982 


100 


231 


AB027466 


Homo sapiens | spondin 2


17S6 


99 ( 


232 


W9S634 


Homo | Homo sapiens secreted
sapiens | protein.


1391 


100 


233 


K00365 


Homo sapiens | Human cyclin B1.


2218 


99 


234 " 


■^53762" ■ ■ 


Homo sapiens | A GTP-binding polypeptide


2 017 


100 










designated RAO . 






23' 


ZEC74S 


Homo sapiens 


yeast sds22 homolog


1800 


10C | 


2 3 V 


7.0C74S 


Homo sapiens


yeast sds22 homolog


1 7 54 


9 e i 


i 23* 


AfcC26491 


Homo sapiens


PICK1


2137 


100 ; 


" 2 3* 


AJ2 70205 


Entodinium
caudatum


putative 

phosphatidylinositol-4-
phosphate 5-kinase


114 


37 

i 

! 




ABC301B9 


Mus musculus


contains transmembrane (TM)
region and ATP binding region


710 


53 i 
i 


1 24C 


1*56 53 8 


Homo sapiens


Human hedgehog interacting 
protein (HIP) . 


378b 


99 i 
1 

1 


i 


W5653B 


Homo sapiens


Human hedgehog interacting
protein (HIP) . 


3436 


99 ! 


242 


AF J 55107 


Homo sapiens 


NY-REN-37 antigen


£56 


99 i 


h U:* 


AF 155107 


Homo sapiens 


NY-REN-37 antigen 


100S 


100 j 


24< 


AL031320 


Homo sapiens 


cJ20N2.j (novel protein 
similar to yeast and
bacterial cytosine
deaminase)


763 


9? 

1 

1 


24b 


|TJ37C26 


Rattus


sodium channel beta 2 subunit


162 


1 30 






norvegicus








241 


ALC7 8 599 


Homo sapiens 


du99lC6.1 (novel protein 
similar to C. elegans
F55A12.5 (Tr:PS1086>> 


23 9: 


96 

J 


247 


U3227 4 


Saccharomyce 
s cerevisiae 


Ydr3 86wp ; CAI ; 0.1' 


191 


i 


24 fc 


Y41719 


Homo
sapiens 


Human PK0864 protein
sequence . 


1C79 


100 ; 


24<r 1 


ABC29434 


Homo sapiens 


ghrelin precursor 


611 


[100 


250 


X97631 


Rattus
norvegicus


carnitine/acylcarnitine
carrier protein 


246 


1 




W6C9S3 


Homo

sapiens 


Human RIP-interacting factor
Rll . 


172 4 


100 

1 


25; 


Y94&73 


Homo
sapiens


Human protein clone HP02632 . 


1676 


100 | 


253 




Homo sapiens 


Amino acid sequence of the 
CDNA clone AIF-2 <HZfcGM49) . 


765 


100 


254 


AE354533 


Leishmania
major 


possible adenylate kinase 


265 


1 


255 


AF2 3 3 322 


Mus musculus 


zinc transporter like 2 


1916 


3* 


2S< 


Y7S113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID
NO: 5 . 


2247 


99 | 

1 
i 


257 


AL035539 


Arabidopsis
thaliana


putative amino acid transport 
protein 


390 


1 

! 


256 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61 . 


1171 


100 | 

t 


"25S 


AL075689 


Homo sapiens 


dJ18 7Jll.l (novel protein 
similar to protein kinase C 
inhibitors) 


974 


IOC 

1 


260 


AE000909 


Methanobacte 
rium

thermoautotr
ophicum


serine/threonine protein 
kinase related protein 




30 


^oj 


AJjU 3 UlJ J 


Homo sapiens 


hypothetical protein 


V £. t> 


IOC 


262 " 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5


1214 


10C 


263 


AL035593 


Homo sapiens 


dJ310J6.l (novel protein) 


821 


10C 


264 


AL022318 


Homo sapiens 


bK150C2.3 (PUTATIVE novel 
protein similar to AP03EC2) 


1072 


10C 

1 


265 


AF2C5940 


Homo sapiens


endomucin


1269 


100 i 


266 


AL023583 


Homo sapiens


dJ5001)14.1 (novel protein)


789 


ioc ; 


287 


AL034S48 


Homo sapiens


dJH03G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein CBFW)


1888 


i 
l 
16 64 




( 26 h 


AFlf.l47C 


Homo sapiens 


KSPC121 


-L { 1 




AF161O0 1 Homo sapiens 


I HSPC121 


1232 




( 270 


X90763 


Homo
sapiens


HHa5 hair keratin type 1 
intermediate filament 


21 90 




2 7 2 


AF207600 


Homo sapiens


ethanolamine kinase




Toe 


27? 


M3 2 * 3 < 


Homo sapiens 


intercellular adhesion
molecule < 


14DG 


101 


2 7"* 




Homo sapiens 


ucpni \c 

i ) O x v.- J. <s " 


663 


^ • 


274 

1 


Y53C52 


Homo sapiens


Human secreted protein clone 
df202 3 protein sequence SEQ 

TD Nf) • n f 


587 " " n 


30C 




- — ~ t — • 

Y77S76 


Homo sapiens 


Human cytoskeletal protein
\ri\* ill \ ci one i 1331 aoj . 


762 1 


100 


277 




Homo sapiens 


30S ribosomal protein S7
homolog


1269 




27H 


Y 94 9 07 


Homo sapiens 


Human secreted protein clone
cal06 1 9x protein sequence
*zvc\ r> kid ■ 9 n 


lb J. y 




9t 


2 7^ 


Y 6878? 


Homo sapiens 


Amino acid sequence of a

human phosphorylation 
effector PHSP-2C 


280] 




2 8C 


275134 


Canis

familiaris


roo. trsnEcuci r. 


1 ft1 c 


JUl 


2 83 


Z7 513 4 


Canis

familiaris


, 

roc transcucir* 


1/10 


St 


282 


AF24 9873 


Homo sapiens 


muscle-specific protein


1395 


10C / 


2 8 j 


ALO5O0O7 


Homo sapiens 


hypothetical protein


4 OS 


9t | 


2 84 


AF201531 


Homo sapiens 


DC1 


1859 


-li- — ! 


2 85 


AF156102 


Homo sapiens 


ELL complex EAP30 subunit


1318 


■li 1 


286 


Y3S897 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO.
146. 


1250 


i 


287 


1)88964 


Homo sapiens 


HEK4 5 


923 




288 


AL05014 3 


Homo sapiens 


hypothetical protein


598 


1 ot 


28S 


AJ0H096 


Homo sapiens 


telethonin


574 


10C 


290 


Y66724 


Homo
sapiens


Membrane -bound protein 
PR083 6 




10C 


291 


AF034801 


Homo sapiens 


1 iprin- alphas 


256 b 


9h 


292 


AF0340C1 


Homo sapiens 


1 iprin- a lpha< 


2 590 




293 


7. 1 ftyl ODC1 


Homo sapiens


dJ889J22B.l (novel protein 
(isoform 1)) 


1738 


10C 


294 


Y7334f 


Homo sapiens 


HTRM clone 8 3 96 Si protein 
sequence.


124^ 


9S 


295 


L11672 


Homo sapiens 


zinc finger protein 


1694 


44 1 


2 56 






carrier protein- 1 (EMCPl)) 


1024 


71- 


297 


AF1 98532 


Homo sapiens 


lymphoid enhancer binding 


2173 


100 


298 


AF161417 


Homo sapiens 


HSPC29J; 


1147 


81 


299 


AF15914J 


Homo sapiens 


breast cancer metastasis-
suppressor 1


1236 


95 


3 00 


U26397 


Rattus
norvegicus 


inositol polyphosphate 4-
phosphatase


160 


30 


301 


AF036I45 


Homo sapiens


meningioma -expressed antigen 
5 


3458 


10C 


302 


Z82022 


Homo sapiens 


GlcNAc-1-P transferase


2067 


99 


303 


AF269232 


Mus musculus


butyrophilin- like protein 
BUTR-l 


271 


50 


304 


AJ222644 


Arabidopsis
thaliana


asparaginyl-tRNA synthetase


659 


sc 


305 


AF0S4180 


Homo
sapiens 


hematopoietic cell derived
zinc finger protein


351 


75 


306 


AJ272 079 1 


Homo sapiens 


APOBEC-l stimulating protein 


3056 1 


10C 


3 08 


Y44486 


Homo
sapiens 


Human GPRW receptor 
polypeptide . 


1721 j 


IOC 


309 


AJ131891 


Homo sapiens 


DNA polymerase mu 


2598 | 100 


31C 


AF2 9333b | Homo sapiens | p30 DBC


i24e 


s: 


323 


AF176525 | Mus musculus | F-box protein FBL1<


1501 


c -. 


3 3 ; 


X5780: | Homo sapiens


immunoglobulin lambda light
chain 


955 • 


t: 


313 


Z36711: 


Homo sapiens 


Ket 


2046 




314 


Ar 161532 


Homo sapiens


HSPC047 


72 7 | 


100 


315 


AF208O68 


Homo sapiens 


kelch-like protein KLHL3a


304 6 


300 


316 


Y6666*. 1 Hcm\< 

| sapien:. 


Membrane-bound protein
PRO1013.


1166 


100 


317 


Y29666 | Homo sapiens 


Human Ras protein RAPR-i. 


1253 


St 


31£ 


AJ3877 0 | Homo sapiens


sialin 


2614 


Ct 


31S 
320 


AF161362 | Homo sapiens 


HSPC099 


224 


40 


Y6877? 


Homo sapiens


Amino acid sequence of a
human phosphorylation
effector FHSP-i 


2243 


bb 


321 


AJ23837S 


Homo sapiens 


putative TH3 protein 


303 :- 


30C 


322 


AB040812 


Homo sapiens 


protein kinase PAK_ 


37S; 


9' 


323 


Y550i:^ 


Homo sapiens


Human secreted protein
vc4 6__l, SEQ ID NO:6t . 


913 


100 


324 


Y1338D 


Homo sapiens 


Amino acid sequence of 
protein PR0271 - 


I97t 


100 


32b 


Y94944 


Homo sapiens


Human secreted protein clone 
of!57_16 protein sequence 
SEQ ID NO .94. 


230' 


9t 


320 


Y76884 


Homo sapiens 


Retinoblastoma binding
protein - Vsequencc . 


672€< 


9b 


327 


AF19B5:v; 


Homo sapiens


lymphoid enhancer binding 
factor- 3 


2173 


IOC 


32 8" " 


Z78013 


Caenorhabdit 
is elegans


Similarity to Drosophila
Cadherin-related tumor
suppressor 


569 




329 


AF212 92 3 


Mus musculus


MMTV receptor variant 3 


494 


\ 94 


33 1) 


27 53 3 L 


Home 

sapiens] 
>R6S207 

KbZ>Z VJ / y Z - 

MAR- 3 955 27- 
AUG- 1 953 
Humar. 

stromalin-3 . 
(Home 
sapienr 


nuclear protein SA-3 


64 92 


1 


331 


AL008583 


Homo sapiens


dJ3 27«J3 6.3 (supported by 
GENSCAN, FGENES and GENEWISE)


2 33? 


5 V 


332 


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO.
489. 


310 


41 


333 


AJ271669 


Homo sapiens


putative sialoglycoprotease


1747 


300 


334 


AF156598 


Mus musculus


p53 -regulated DDA3 


997 


64 


335 


M99056 


Eimeria 
maxima 


enUOO gene is homologous the s 154 
Eimeria tenella gene etlOO \ 


2t 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence. 


3386 


en 


337 


YBS564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence. 


2602 | 9<. 


336 


Y85S64 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence. 


3447 ! Sff 


335 


Z6656I 


Caenorhabdit
is elegans


Similarity to Human rabli 
protein (PIR Acc . No. 
A49647) . 


716 




340 


AB021643 


Homo

sapiens 


gonadotropin inducible
transcription repressor-3 


2761 


99 


343" 


G01946 


Homo sapiens


Human secreted protein, SEQ 
ID NO : 6027. 


465 


91 


342 


AF020S4: 


Homo sapiens 


zinc finger protein 


1093 


4* 


343 


L2 9154 


Homo sapiens


immunoglobulin heavy chain 


439 


84 







i 


VDJ region ; 




3 4<i 


Ul 026: 


Sus scrofa 


gastric mucin


27S 




34?> 


AX0O04CK 


Homo sapiens


unnamed protein product 


1177 


91 


346 
"347 


L2 2SS", 


Rattus 
norvegicus


calmodulin-binding protein


194! 


84 


L22S5", 


Rattus 
norvegicus 


calmodulin-binding protein


2363 




348 


AL0 4 94 8j 


Arabidopsis
thaliana


AIG1-like protein


31 C 


1 C 


3bO 


AJ^515H 


Mus musculus 


cysteine and histidine-rich
protein 


1460 


5b j 


351 


AKG244 7 7 


Homo sapiens 


FLJ00070 protein


1773 


100 


352 


U&C133 


Homo sapiens 


ankyrin


SO* 


1 33 


353 


AK000S21 


Homo sapiens


unnamed protein product 


723 


IOC 


354 


AF161420 "| 


Homo sapiens 


HSPC302 


26 23 


97 


355 


AJ0i003^ 


Homo sapiens 


M96A protein


126? 


47 


356 


AF153029 


Homo sapiens 


HSPC19* 


94 2 


91 


35V 


AL02232? 


Homo sapiens 


dJ35SC10.1 (KIAA0027) 


1913 


100 


358 


W/812F 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone
HOSB196 


1117 


100 


359 


X034 3 4 


Drosophila 
melanogaster


Kr polypeptide


316 


4b 


360 


AF1S1079 


Homo sapiens 


HSPC24b 


643 


100 


361 


YS3fc8t> 


Homo sapiens 


A suppressor of cytokine
signalling protein
designated HSCOP-6. 


530 


43 


362 


AF2S4741 


Drosophila
melanogaster


Centaurin Gamma 1A


661 


4fc 


363 


AF21346L 


Homo sapiens 


dual oxidase


201t 


100 


364 ] AF181562 


Homo sapiens 


proSAAS 


131S 


100 


365 


AF161562 


Homo sapiens 


proSAAS 


1024 


99 


366 


U73 2 00 


Mus musculus


p116Rip


884 


82 


367 


AF263744 


Homo sapiens 


erbb2-interacting protein
ERBIN 


4 973 


95 


366 


U37503 


Mus musculus


laminin alpha 5 chain 


5B67 


72 


369 


AF04369* 


Caenorhabdit 
is elegans 


similar to the protein 
phosphatase 2c family


549 


36 


370 


Y7344C 


Homo sapiens


Human secreted protein clone 
yj23_l protein sequence SEQ 
ID NO:J02 


1484 


99 


371 


AF272833 


Homo sapiens 


misato


2865- 


97 


372 


AF19H454 


Homo sapiens 


epithelial protein lost in
neoplasm beta 


3927 


100 


373 


Y73 3 4b 1 Homo sapiens 


HTRM clone 436283 protein
sequence . 


273 


80 


374 


AFif?or, 1 Homo sapiens 


formiminotransferase
cyclodeaminase


2717 


98 


375 


A953 06 \ unidentified 


RED ALPHA 


1202 


99 


376 


W71fc2fr 


Homo sapiens


Human secreted protein
encoded by gene 100 clone 
HLQA952 . 


ioi: 


99 


377 


Y32131 


Homo sapiens


Human LYST-2 protein. 


3556 


99 


378 


M14912 


Homo sapiens 


po. 


132 


86 


379 


AFOS-0934 


Homo sapiens 


PRO0518 


382 


100 


380 


X66 3 6? 


Homo sapiens 


serine/threonine protein 
kinase 


2499 


100 


381 


Y41fc99 


Homo 
sapiens 


Human PRO703 protein 
sequence . 


2362 


100 


382 


AF174498 


Homo sapiens 


GR AF-1 specific protein 
phosphatase 


7006 


98 


3 83 


U6460t 


Caenorhabdit
is elegans 


coded for by C. elegans cDNA 
ykl73cl2.5 


24* 


36 


3 84 


US 01 3? 


Homo sapiens 


ankyrin 


502 


33 


38S 


AJ23e5^0 


Homo sapiens 


putative transcription 
factor-like nuclear regulator 


4123 


97 
389
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SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE

% IDENTITY


"Homo sapient 


BM-oo:- 


13-5 \ Sf " 


XS76P . 


Homo sapiens 


immunoglobulin lambda light
chain


7*\ i ~n 


390 


AF1B24C4 


Homo sapiens


mitochondrial uncoupling 
protein 1 


1670 


S5 


391 


Y85564 


Homo sapiens


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


33 a 


S7 


393 


AF17843I 


Homo sapiens


SH3 protein


37CC I 10C 


3 94 


AF229?2f 


Drosophila
melanogaster


cytoplasmic protein 89BC


16i< 


6; 


395 ! AF18172: 


Homo sapiens


RU2S 


2214 


10C 


396 | Y6919-. 

1 


Homo sapiens


Amino acid sequence of a
human betaIV-spectrin
protein.


162 ( 


9E 


397 ! U4823f: 


Mus musculus


zinc finger protein neuro-d4


74 ; 


6C 


j y o 


Al,3 9013*i 


Homo sapiens 


hypothetical protein 


263 


51 


TOO 
J -s-f 


AF2 17521 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


6C 


400 


AM) 2 2 599 


Schizosaccharomyces
pombe


WD repeat protein


44*/ 


27 


4 01 


AC004 6S5 


Homo sapiens 


similar to 2-oxoglutarate
dehydrogenase, similar to
002218 (PID:c2?52618) 


437f 


78 


4 02 


AB01026* 


Mus musculus 


tenascin-X 


1024 e 


62" 


403 
~T6"4 


AL13328* 


Homo sapiens 


dJ671D7.1 (similar to 
D. melanogaster CG5986
protein) 


76: 


10C 


26 8 7 53 


Caenorhabditis elegans


ZC518 .3b 


88f 


48 


405 


Z78013 


Caenorhabditis elegans


Similarity to Drosophila
Cadherin-related tumor
suppressor


565 


33 


406 


AE03123C 


Homo sapiens 


protein containing CXXC
domain ; 


11S( 


97 


407 


AF15510C 


Homo sapiens 


NY-REN-36 antigen


116t 


100 


408 


Y57945 


Homo sapiens 


Human transmembrane protein
HTMPN-6S . 


l53fr 


99 


409 


?.1636i 


Ovis aries


trichohyalin


184 


30 


410 


AF249744 


Homo sapiens 


RhoGEF 


2733 


100 


411 


AJP17652S 1 MUS musculus 


F-box protein FfcXl 3


2071 


94 j 


412 


AF21084i | Homo sapiens 


HARF 


4 88C 


ioo 


413 


AL03365^ | Homo sapiens
1 


dJ310013.7 (novel protein 
similar to H. roretzi. JIRPET- 
3) 


771 


9b 


434 


X5739C 1 Homo sapiens 


pm5 protein 


613J 


95 


455 


AB02 98 2f 1 Homo sapiens 

i 


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit


296 • 


99 


436 


U43502 


Saccharomyces cerevisiae


Lphlp 


Hi 


42 


417 


AL16045 * 


Leishmania 
major 


possible t26fiv 21 


239 


31 


418 


¥081 0C 


Homo sapiens 


Human PRO331 protein.


330 


29 


419 


Ul513i 


Homo sapiens 


pl26 


222f 


54 


420 


AF11794* 


Homo sapiens 


Link guanine nucleotide 
exchange factor II 


2361- 


100 


421 


AF1 906 3 b 


Drosophila 
melanogaster 


ankyrin 2 


751 


30 


422 


AF30235C 


Homo
sapiens


phosphoinositol 3-phosphate-
binding protein-2


196; 


100 


423 


AL137531 


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63 753 


Homo sapiens 


son-e. 


7269 


100 


425 


AB02724S 


Homo sapiens 


KAPXK like protein kinase 


1693 


100 


426 


AF2795 44 


Homo sapiens 


tumor endothelial marker 7
precursor 


106< 


55 



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r SFrClES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


% IDENTITY


<,2', 


AF279144 


Homo sapiens


tumor endothelial marker 7
precursor


12 5' 


56 I 


Mi 


AE003683 


Drosophila
melanogaster


CG6312 gene product


14 1 


lb 


Mb 


Y07829 


Homo sapiens 


RING finger protein


2203 


99 


4 30 


AF096897 


Drosophila 
melanogaster


pus novei 


4441 


4 7 


43; 


U4H87 


Homo sapiens 


Gu protein


402:, 


99 


43: 


AF023674 


Homo sapiens 


nephrocystin


37fc; 


100 


4 J.i 


AF146760 


Homo
sapiens 


septin 2-like cell division 
control protein 


22 E 4 


100 


43<i 


AB0066S7 


Arabidopsis
thaliana


cleft lip and palate 
associated transmembrane 
protein-like 


B6( 


42 


437 


Y94 24 7 


Homo sapiens 


Human calcium binding protein 
hCBV . 


17 0', 


100 


"43C 


AB040672 


Homo sapiens


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


107: 


63 




AFI 05228 


Bos taurus


tuftelin


28^: 


33 


44^ 


H0b463 


Homo sapiens


Derived protein of clone
1CA13 (ATCC 4 0553) . 


3071- 


99 


44: 


XI 4 971 


Mus musculus


alpha-adaptin (A) (AA 1-S77) 


4897 


98 


442 


X53773 


Rattus
norvegicus


alpha-c large chain (AA j- 
938) 


3975 


81 


4 4> 


Vf 6689 


Homo
sapiens


Membrane-bound protein
?R0313t , 


329!- 


99 


444 


AC0677&4 


Arabidopsis
thaliana


Unknown protein; 20348-23707


114 


33 


4 41: 


AF229032 


Mus musculus 


pi] 


2077 


93 


4 4 f 


AF056035 


Rattus
norvegicus


f.«nexilin 


266^ 


85 


4 4 7 


f AFI32484 


Mus musculus


unknown


47f- 


51 


441 


W69024 


Homo sapiens 


Polypeptide fragment encoded
by gene 156.


52e 


.. | 




AF161445 


Homo sapiens 


HSPC32 7 


160fc 


100 i 


4 St. 


768753 


Caenorhabditis elegans


ZC518.3b 


95i 1 


49 


Tsi" 


W39160 


Homo sapiens


Human partial complement
factor H protein fragment 3.


155 


32 


45> 


We5727 


Homo 
sapiens


Novel protein (Clone
BM46_20).


279S 


99 


TsT 


Y53629 


Homo sapiens 


A bone marrow secreted 
protein designated BMS115. 


281C 


100 


454 


D8743fc 


Homo 
sapiens 


Similar to a C. elegans
protein in cosmid C14H10


4 065 


100


455 


AF240468 


Homo sapiens 


nicastrin


3687 


100 


45e 


Z15005 


Homo sapiens 


CENP-> 


1330i 


99 


457 


MES226 


Homo 
sapiens


gamma-aminobutyric acid
receptor beta -I subunit 


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 
yd61 1 protein sequence SEQ 
ID NO: 156 . 


96* 


100 


459 


W67E24 


Homo sapiens


Human secreted protein 
encoded by gene 18 clone 
HSL?tt2S . 


535 


100 


46G 


AF363151 


Homo sapiens 


dentin sialophosphoprotein
precursor 


279 


19 


463 


D87446 


Homo sapiens 


Similar to a C. elegans
protein encoded in cosmid
C27F2 (U40419)


9196 


95 


4C2 


G04044 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8125. 


486 


93 


46:- 


AC0023 98 


Homo sapiens 


F25965_i 


lOlf 


200 


464 


AF064856 


Rattus sp . 


7accmp protein 


1845 


84 


_ 4tS 


AF2234 0B 


Homo sapiens 


E95 


3686 


99 



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TABLE 2 



SEQ | ACCESSION
ID | NUMBER
NO:


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


"~46 f 


A~F22 34 0e" 


Homo sapiens 




287* 


87 " 1 




AFIO^S 


Mus musculus


gene trap locus-13


63 3t 


91 


46* " 


| JS34 50 


Rattuf- | Gun dinner; nation protein 


15' 


49 " ~ 






jjorvegicnr. 


- JDP-: 








AI..0312 97 


Homo sapiens 


1 dJ97?20.i (novel gene) 


3 56v 


99 


470 f AF2S7077 


Homo sapiens 


eukaryotic translation 


3274 










initiation factor EJF2t 




1 








subunit 3




/ 


473 


L2 62 2 5 


Podospora
anserina


beta transducin-like protein


284 


3t 

1 


47; 


YB4 903 


Homo sapiens 


A human proliferation and
apoptosis related protein.


2~3 3' 


200 | 


473 


AF34423T 


Homo sapiens 


LOMF protein 


251 


44 ; 


474 


Y732:l- 


Homo sapiens 


Human irritable bowel disease
related polypeptide IKX3 9 . 


83* 


100 j 

: 


47b 


Y95001 


Homo sapiens 


Human secreted protein 
ve!3 1, SEQ ID NO: 52


34i: 


100 1 

1 


~47€ 


D3 854S- 


Homo sapiens 


ha!025 is new 


653: 


99 , 


4 77 


AF241230 


Homo sapiens 


TAK1-binding protein 2


365l 


100 1 


~4~t? 1 


1 Ab032534 


Schizosaccharomyces
pombe


putative asparagine synthase 


4 8: 


4C 


479 


L2812S 


Podospora 
anserina 


beta transducin-like protein


233 


26 




AF161544 


Homo sapiens 


KSPC0S5 


434 


77 


481 


AJ2 3 624 6 


Homo sapiens 


centaurin beta2


398* 


99 


4~82 


Z3806: 


Saccharomyces cerevisiae


mslb, 3tal, len: 1367, CA1 
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


29i 


23 

j 


483 


AF161381 


Homo sapiens 


HSPC26 


1404 


100 , 


484 


AF223468 


Homo sapiens 


AD021 protein 


131': 


100 


"4 86 


X57527 


Homo sapiens 


alpha 1 (VIII) collagen


416( 


91- 


4 87 


Y2 9062 


Homo sapiens 


39k2 protein 


247! 


10C 


~486 


V73 3 73 


Homo sapiens 


HTRM clone 921803 protein
sequence.


55!: 


be 


485 


AL021936 


Homo
sapiens 


b34I6.1 (Kruppel related Zinc 
Finger protein 184) 


4184 


100 


"4 90 


X53773 


Rattus
norvegicus


alpha-c large chain (AA 1-
938)


467f ; 


97 




U5242* 


Homo sapiens 


\Gor 


14 5>. 


59 




AL3 59773 


Leishmania
major


possible threonine synthase 


7D2 


45 


4 9? 


AF226614 


Homo sapiens 


ferroportin1


2925 


100 


494 


7,93242 


Homo sapiens 


dJ222E3 3.1 (novel protein 
with some similarity to 
Droscpmia kkajvJvNj 


513 


96 


495 


AF036977 


Homo sapiens 


unknown


1B12 


100 


49f 


U93564 


Homo sapiens 


p4 0 


133 


45 


497 1 Y91405 


Homo sapiens 


Human secreted protein 


357 


100 








ot-^utru^c nu^uueu u y y cue * 












SEQ ID NO:126. 






498 


AF069782 


Drosophila 
melanogaster


*em46-like protein 


653 


43 


499 


V16601 


Homo sapiens 


Human cell-cycle 
phosphoprotein CECYP-2 . 


1656 


98 , 


SOO 


X7C944 


Homo sapiens 


PTB-associated splicing 
factor


J tf 0 -5 


1 nn 
J. uu 


502 


AF027503 


Mus
musculus


putative membrane-associated
guanylate kinase 1 


205 


36 


502 


AF282874 


Homo sapiens 


nectin 3; PRR3 


2856 


99 


503 


AJ24 9732 


Homo sapiens 


G6 protein 


669 


100 


504 


AF208861 


Homo sapiens 


BM-01S 


162S 


100


50b 


L09706 


Homo sapiens


complement component C2 


4 022 


100 


I S07 


X66285 


Mus musculus


HC1 ORF 


115 


43 


[ soe 


D0018S 


Rattus 
norvegicus


Na+, K+-ATPase alpha-subunit


5227 


99 



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TABLE 2 



SEQ ID NO:


ACCESSION
NUMBER

Y94T7: 


SPECIES | DESCRIPTION


SMITH- 
WATERMAN 
SCORE 


% IDENTITY


scT 

~5T( 
"61 j 

"sir - " 

52 7- 


Homo sapiens


Human secreted protein clone
fal71_l protein sequence SEQ
ID Ko7l46 . 


2176 


300 


AB019038


Homo sapiens


beta-1,4 mannosyltransferase


781 


77 


AB019038


Homo sapiens


beta-1,4 mannosyltransferase


1347 


100 


AB019038


Homo sapiens


beta-1,4 mannosyltransferase


1520 


95 


X84908 


Homo sapiens 


phosphorylase kinase


5729 


99 




XS28S1 


Homo sapiens 


peptidylprolyl isomerase


650 


76 


51! 


AFT86084 


Homo
sapiens


epidermal growth factor 
repeat containing protein 


3046 


99 


Sli 


G03602 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7683.


505


99 


"Tr; 


U04706 


Bos taurus


50 kDa protein


2749


77 


sit 


G00G63 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 4734.


530


100 


5ii 


AFT61475 


Homo sapiens 


HSPC12* 


1368 


100 j 


520 


Y99366 


Homo sapiens


Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO:81.


3394 


97 


52; 


AF266852 




PTPLA 


1295 


2 00 


5 2* 


AE0009SS 


Archaeoglobus fulgidus


chromosome segregation
protein (smc1)


152 


20 


52 > 


AF06^224 9 


Homo sapiens


immunoglobulin heavy chain
variable region 


605 


s? 


52<: 


AJ223830 


Rattus
norvegicus


are: 


2950 


SB 


551 
~52t 


W01535 


Homo sapiens


Cellular homologue of the
SV40 large T antigen. 


127 6 


! 83 


AF14 5658 


Drosophila
melanogaster


BcDNA.GH10229


320 


3 3 


"52"* 


AF112213 


Homo sapiens


putative Rab5-interacting
protein 


524 


79 


52f 


D49387 


Homo
sapiens


NADP dependent leukotriene b4
12-hydroxydehydrogenase


1616 


100 


529 


Y30819 


Homo sapiens


Human secreted protein
encoded from gene 5


328 


• 


53 C 


AL079335 


Homo sapiens 


dJ132F21.3 <72,1 KDa protein 
(DKF2P554A032. SBEJ88) 
similar to mouse IFN-gamma
induced MG11.)


105S 


95 




Y91S06 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 17 9. 


1159 


96 | 


53; 


X7611* 


Caenorhabditis elegans


carrier protein (c2) 


S7b 


50 


S3? 


X76116 


Caenorhabditis elegans


carrier protein (c2) 


506 


50 


53-1 


X12966 


Homo sapiens


3-oxoacyl-CoA thiolase
propeptide (424 AA) 


1972 


100 


53L 


Y09267 


Homo sapiens 


flavin-containing
monooxygenase 2 


2486 


10C 


1 53 1 


Z11773 


Homo sapiens 


SRE-ZBP 


2201 


99 


637 


D84224


Homo sapiens


methionyl tRNA synthetase


4741 


99 | 


53t 


D84224 


Homo sapiens 


methionyl tRNA synthetase


3887 


99 


535 


D84224 


Homo sapiens 


methionyl tRNA synthetase


2933 


96 




D84224


Homo sapiens 


methionyl tRNA synthetase 


4529 


99 


54: 


003244 


Bos taurus


H+ ATPase 31kDa subunit (EC
3.6.1.3)


848 


77 


542 


Y92514 


Homo sapiens 


Human OXRE-11. 


2301 


99 • 


543 


AF221712 


Homo 
sapiens


Smad- and Olf-interacting
zinc finger protein


2151 


61 


S44 


AE000919


Methanobacterium
thermoautotrophicum


conserved protein


207 


36 


546 


A06669 


synthetic
construct


preTGF-beta1


2070


99 






WO 01/53312 



PCT/US00/34263



TABLE 2



SEQ | ACCESSION
ID | NUMBER
NO:


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


54 f 


Y0269& 


Homo sapiens 


Human secreted protein
encoded by gene 49 clone
HTFCS6 0 . 


854 | St 

1 


54*. 


AF112205 


Homo sapiens


WSB-1 protein


2275 j 101 


S46 


X602 /I 


Mus musculus


c-zel 


2264 i 74 


54 5 


AC016&2 7 


Arabidopsis
thaliana


putative GTPase


810 i 
i 


550 


Y7040C 


Homo
sapiens


Human cell-signalling
protein-1.


429 


i 


551 


A304836S 


Homo sapiens


NEDD4-like ubiquitin ligase :


8290 


95- 


552 


Y578eC 


Homo sapiens 


Human transmembrane protein
HTMPN-4 . 


1112 


9'. 


553 


AF119855 


Homo sapiens 


PROl 84V 


265 


67 


554 


MX 723 C 


Homo sapiens 


MHC HLA-DQ alpha precursor


1332 


10C 


555 


AL078468 


Arabidopsis
thaliana


putative protein


540 


4 0 


SSt 


AC006S63 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77C 2 '> 
(PID:g4650844)


515 




55*5 


AK024487 


Homo sapiens 


FL0O0086 protein


1623 


9fc 


55fc 


M1214C 


Homo sapiens 


pol gene protein; Xx> 


117 




555 


W74825 


Homo sapiens 


Human secreted protein
encoded by gene 97 clone 
HAQDF73 . 


225 


51 1 

i 


560 


X56 68I 


Homo sapiens 


junD protein 


373 




561 


AF003136 


Caenorhabditis elegans


contains weak similarity to
an AMP-binding motif


2926 


5* 


562 


AL109839 


Homo sapiens


dJJ06 9P2.3.1 (novel PAPFCi 
(poly(A)-binding protein)


877 


1 100 


563 


AF181640 


Drosophila 
melanogaster


BcDNA.GHOSBi; 


289 


t 4: 


564 


AF05272 3 


Feline
leukemia
virus


gag-pol precursor polyprotein
gPr80 


1547 


1 41- ! 


565 


AF161472 


Homo sapiens 


HSPC123 


439 


44 


566 


Y28B17 


Homo sapiens 


pt326_4 secreted protein.


3338 


10C 


567 


U09848 


Homo sapiens 


zinc finger protein 


1738 


100 


569 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3603 | 9> 


570 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3951 


95 


571 ( AL032821 


Homo sapiens 


d055C23. 1 (vanin 1)


1821 


9f 


572 


M691B1 


Homo sapiens 


non-muscle myosin F 


7350 


9 b j 


573 


M691B1 


Homo sapiens


non- muscle myosin L 


7311 


1 st 


574 


Y59678 


Homo sapiens 


Secreted protein 108-008-5-0- 
E6-FL . 


772 


100 


575 


AL365234


Arabidopsis
thaliana


putative protein


788 


4C 


576 


AL365234 


Arabidopsis
thaliana 


putative protein 


788 


40 


577 


X06745 


Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)


7619 


99 


578 


AB041642 


Homo sapiens 


PAR- 6 


1342 


100 


579 


D86984 


Homo sapiens 


similar to yeast adenylate
cyclase (S56776)


2446 


10C 


580 


AF165124 


Homo sapiens 


gamma -aminobutyric acid A 
receptor gamma 2 


2499 


95- 


| 581 

1 


W86812 


Homo sapiens 


Polypeptide fragment encoded 
by gene 56 . 


2339 


99 


j 582 


082319 


Homo sapiens 


novel ORF 


342 


100 


I 583 


P92219 


Homo sapiens 
(human) 


CRi protein 


11425 


95 


584 


A0223948 


Homo sapiens


RNA helicase


6608 


95- 


1 585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex
protein 


3874 


99 


586 


Y4 23B4 


Homo
sapiens


Amino acid sequence of
Iv3l0 7. 


1007 


3 7 


587 


AF129756 


Homo sapiens 


BAT4 


1873 


9E 



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TABLE? 



SEQ | ACCESSION
ID | NUMBER
NO:


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY




sse 


"AF12177* " 


Homo sapiens 


Unknown


292?- 


CO 


584 


AJ2508C- 


Homo sapiens 


7ESS > 


234f 


:oc ~ 


593 




Homo sapiens 


dj 52 2 07. 2 (bromodomain-
containing 1 (similar to
peregrin, BR140)


416'. 


l foe 


592 


L76 57; 


Homo sapiens 


nuclear hormone receptor


1355 


i 100 


593 


AF09162; 


Homo sapiens 


PHD finger protein 3 


5*054 


| 300 


594 


X56807 


Homo sapiens 


desmocollin type 2a


444} 


| 10C 


595 


AL137BCI 


Homo sapiens 


dJ798A10.1 (novel protein)


212 


; C t " 


596 


AL02232^ 


Homo
sapiens


DK407F11.2 (adrenergic, beta,
receptor kinase 2)


365? 


. 20C 


"597 


AF226 04* 


Homo sapiens 


GIjCOo 


200. L 


i ? c 


596 


A02 7811 * 


Homo
sapiens ] 
^Y49635 
Y49635 21- 
OCT-1999 15- 
APR-1998 
Human sdp3 - 5 
protein . 
{Home 
sapiens 


putative cell cycle control 
protein


335 


1 2" 


599 


Y5974Z 


Homo sapiens 


Human normal ovarian tissue
derived protein 115. 


1 374 


u 


600 


I.3653i 


Homo sapiens 


integrin alpha 8 subunit


5386 


rf- 


601 


Y38458 


Homo sapiens


Human secreted protein
encoded by gene No. 20,


895 


10C 


602 


AF218584 


Homo sapiens 


GGA: 


326S 


JOG 


603 


Y131 j 5 


Homo sapiens 


serine/threonine protein
kinase 


5 071 


<-.<; 


604 


AL132 7 76 


Homo sapiens 


dJ393D12.1 (KIAA0776)


24 i:-. 


9& 


60S 


AL0344S2 


Homo sapiens 


d06B2J15.1 (novel Collagen 
triple helix repeat 
containing protein) 


2 97 9 


100 


606 


Y14494 


Homo sapiens 


aralar1


3461? 


f c 


607 


A70C198: 


Homo sapiens 


OXAU 


2603 


100 


608 


X8609P 


Homo
sapiens


binds directly to adenovirus
type 5 E1A protein


3069 


"OC 


610 


AK163572 


Homo sapiens 


Forssman glycolipid
synthetase


18CS 


c C. 


611 


AF161S0? 


Homo sapiens


HSPC154 


1261 


(. -f 


612 


1,41834 


Ensis minor 


nuclear protein


34 5 


M 


613 


Y919-S4 


Homo sapiens 


Human cytoskeleton associated
protein 9 (CYSKP-9).


3668 


: oo 


614 


(" AL0223 2 7 


Homo sapiens 


dJ355Cl8.i (KIAA0027) 


[26] 


- 


615 


X 85 76 6 


Homo sapiens 


binding regulatory factor


3203 


-j 00 


616 


Y08319 1 


Homo sapiens 


kmesin-i 


3487 


c; 


617 


[ D12644 


Mus musculus 


KXF2 protein


3609 




618 


U28765 


Mus musculus


PACT 


5936 




619 n 


Y35914 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
163. 


1684 


c < 


620 


AB046382 


Mus musculus 


testis-abundant finger
protein 


199 


: : 


621 " 


Y00O62 


Homo sapiens


precursor polypeptide (AA -23
to 1120) 


3440 




622 


AF0662 86 


Homo sapiens 


HDCMD38P 


86) 


1 00 


623 


X9824& 


Homo sapiens


sortilin 


4436 




624 


X61100 


Homo sapiens


75 kDa subunit NADH 
dehydrogenase precursor 


3734 


c. <. 


625 


S58544 


Homo sapiens 


75 kda infertility-related 
sperm protein 


2125 


c o 


626 


AF151027 


Homo sapiens 


HSPC19 ♦ 


5e2 


, 5> 1 


627 


X1496B 


Homo sapiens 


RII-alpha subunit (AA 1-404)


207S 


j 00 


'62T " 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7__l derived protein 


1963 


1 GO 



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table: 



SEQ | ACCESSION
ID | NUMBER
NO:


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


625 


Y50513 j homo sapiens 


Human fetal brain cDNA clone
vb7_l derived protein


16S< | 30t 


6 3> 


AF0 9876' j Hcm : 

| sapiens 


17 beta-hydroxysteroid
dehydrogenase type VII


175 4 | JlH 


633 


AI.03455S fHomc 

1 sapiens 


dJ1340iP.3 (zinc finger 
protein 151 (pHZ-67))


4 271- | 10( 
i 


632 
633 


W74 826 j Homo sapiens 
i 
1 


Human secreted protein 
encoded by gene 98 clone 
HA0BT94 


794 


S( 


AF28828fc 


Homo sapiens


HPT protein


223* 


IOC 


634 


AF041425 


Homo sapiens 


pRGRl 


82.-V 


95 


635 


X6635'/ \ Homo sapiens 


serine/threonine protein 
kinase


3585 


10C 


63C 


Y112 64 | Homo sapiens 


AFX1 


257'i 


9f 


63 V 


AH0048B4 


Homo sapiens 


PKU-alphc 


371£ 


95 


636 J AJO023O3 


Homo sapiens 


synaptogyrin 1c


102C 


10C 


635 


AJO023O4 f Homo sapiens 


synaptogyrin 1b


" Too"; 


acc 


64 C 


AJ002303 


Homo sapiens 


synaptogyrin 1c


93.- 


94 


64 3 


D87602 


Homo sapiens


Similar to a C. elegans
protein encoded in cosmid
T26A5.


26 71 


10C 


642 


Ml 4 66 C | Homo sapiens 


ISG-K54 


24 7? 


95 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261)


13 5f 


100 


644 


AF1 19900 


Homo sapiens 


PR02622 


185 


li 


645 


AB031048 


Drosophila
melanogaster


microtubule associated
protein orbdv 


73t 


2 V 


646 


AF250842 


Drosophila
melanogaster


multiple asters


834 


25 j 


647 
64b 

"649" " 


X86691 


Homo sapiens 


Mi-2 protein 


101 K 


9^ 


U67934 


Homo sapiens 


44.9 kDa protein C18BU 
homolog


8 27 


9fc 


AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein


3 53C 


93 


650 


AL03 4 553 


Homo sapiens 


dJ914P20.2 (KIAA0784 protein
similar to Mus musculus
activity-dependent
neuroprotective protein
(Adnp))


570* 


100 


653 


X14766 


Homo sapiens 


GABA-A receptor alpha 1
subunit


238f 


* 1 


654 


AC004614 


Homo sapiens 


similar to i-spendin prot.pins- 
AB006086 (PID:g2529225) 


302( 


99 


655 


Y57906 


Homo sapiens 


Human transmembrane protein 
HTMPN-32 . 


60*< 


99 


656 


234975 


Homo sapiens 


IdlCr 


3 733 


10C 


658 


AL050306 


Homo sapiens 


d0475B7.2 (novel protein) 


1 54 : 


99 


659 


W76734 


Homo
sapiens


Human nvDia Rho targeting
protein


7 8 j 


3 4 


660 


AF202724 


Homo sapiens 


Sad1 unc-84 domain protein 1


2172 


1 100 


661 


221966 I Homo sapiens 


uvPOU homeobox protein


1529 


100 


662 


AJ242954 | Mus musculus


dysferlin


475i 


59 


663 


AF182316 ( Homo sapiens 


myoferlin


6232 


95 


665 


AL161516 


Arabidopsis
thaliana 


hypothetical protein 


209 


3D 


667 


X59303 


Homo sapiens 


valyl-tRNA synthetase 


3393 


99 


668 


Y133S5. 


Homo sapiens 


— _ . -m — 

Amino acid sequence of
protein PRC-22C . 


3692 


IOC 


669 


AB010692 


Arabidopsis
thaliana 


contains similarity to endo-

beta-N-acetylglucosaminidase 

gene 


612 


52 


671 


X56123 


Mus musculus


talin 


4474 


76 


672 


AB039371 


Homo sapiens


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF26 9223 


Homo sapiens 


TCP11 


80C 


42 


674 


AF229633 


Mus musculus


groucho-related protein 4


4053 


99 


675 


L14 4 63 


Rattus


transducin


3619 


95 



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TABLE 2 



SEQ | ACCESSION
ID | NUMBER
NO:


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY




norvegicus 




1 


67( | AC0CS757 


Homo sapiens 


R32612 1 j 277<- 


IOC 


677 j SfiOOi 


Homo sapiens 


reverse transcriptase
hcmoloy=?ol (retroviral 
element) 


• 2b- 


6b 


676 j AF271388 
i 


Homo sapiens 


CMP-N-acetylneuraminic acid
synthase 


22 "i . 


100 


675- 


X7906C 


Homo sapiens 


FRF- : 


17f:- 


100 


680 


AF118566 


Mus musculus


hematopoietic zinc finger
protein 


765 


5C 


683 


Y53 41 r . 


Homo 
sapiens 


Human wild type pKe83 
protein . 


262j 


99 


682 


AM33S45 


Homo sapiens


bA386N14.1 (novel protein
similar to a dual specificity 
phosphatase) 


70C 


68 


683 


Y86214 


Homo sapiens 


Nuclear transport protein
clone hfb34l protein
sequence.




95 


664 


Y94 95: 


Homo sapiens 


Human secreted protein clone 
£hll6_ll protein sequence 
SEQ ID NO: 110. 


35'. 


96 


68b 


AL021878 


Homo sapiens 


OJ257I20.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2))


15< 


67 


686 


AE000196 


Escherichia 
coli 


orf, hypothetical protein


62! 


100 


68" 


MS837fr 


Homo sapiens 


synapsin I 


3 73( 


95 


'~68h 


AF03 96 97 


Homo sapiens 


antigen NY -CO 3) 




96 


| 6 8S- 


U093<>e. 


Oryctolagus 
cuniculus


protein phosphatase 2A1 p
gamma subunit


23Bt- 


99 

l 


[~69e 


AF355106 


Homo sapiens 


NY-REN-36 antigen


26! 


SC ! 


!" 65. 


AC004774 


Homo sapiens 


Dlx- I 


154; 


10C 


692 


X90530 


Homo sapiens 


raop 


192* 


99 


6s:- 


X9O530 


Homo sapiens 


ragi- 


140i: 


95 ( 


694 


X90S30 


Homo sapiens 


racji 


159C 


85 ■ 


esb 


GO 3 56? 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5644 . 


33C 


IOC 1 
1 


696 


AC031830 


Arabidopsis
thaliana


Putative methionine
aminopeptidase


665 


52 | 
1 
I 


6 97 


AJ25042S 


Rattus
norvegicus 


Collybistin : 


245L 


98 j 


696 


ABO37901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma-1


5364 


99 


69<- 


YS94 0. 


Homo sapiens 


Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO:21G. 


1361 


IOC 

1 


r:_j 


AF221 712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


6705 


10C I 

1 


[702 


xe?57:< 


Homo sapiens 


ARSE 


3184 


99 j 


1 70:* 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 


99 | 


1 704 


Y73261 


Homo sapiens 


Human chondromodulin-like
protein, Zchml.


3.697 


94 


| 70b 


Y7126I 


Homo sapiens 


Human chondromodulin-like
protein, Zchml.


1736 


99 


706 


Y412S7 


Homo sapiens 


Amino acid sequence of long
human FA3M.


1060


100 


707 


AL022237 


Homo sapiens


bK119lB2,3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 
1) ) 


2030 


1 00 


708 


AJ006266 


Homo sapiens 


AKD-1 protein 


5942 


100 


709 


G01571 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5652. 


777 


99 


7a o 


Y0869t 


Homo sapiens 


ranbp:- 


2849 


98 


713 


Y6877G 


Homo sapiens 


Amino acid sequence of a
human phosphorylation 
effector PHSP-2 . 


7 5"4 


95 H 

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fable 2 



<t"eo 
11 
no. 


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


% IDENTITY


71 ; 


U93b7<- 


Homo sapiens | putative plSC j vs; 


5 9 ! 


71" 


AC0Q453 J 


Homo sapiens 


Gene with similarity to DEAD
box helicases


271! 


! 


714 


PC 901 <. 


Homo sapiens 


Neuroblastoma 


531 


,48 " - 


71: 


V52171 


Homo sapiens 


Human cardiovascular system
associated protein tyrosine 
phosphatase 2 - 


f/34 


9B 

; 
I 


7ie 


AL13 7 013 


Homo sapiens 


bA3HP8.3~ (probable uracil 
phosphoribosyltransferase)


oc; 


100 | 

! 


717 


AK035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase 




93 


71c 


Y9629C 


Homo >P4 0254 
P40254 2S- 
OCT-1984 09- 
APR- 1983 
Human IgD. 
(Homo 
sapiens 


Human IGFAM-2 immunoglobulin. 




85 

1 
i 


71 £ 


X07975* 


Homo sapiens 


integrin beta 1 subunit
precursor 


"4 34 7 


9S ! 

i 


72( 


AJ224 81 9 


Homo sapiens 


tumor suppressor 


214.* 


95 


72 j 


Y0759b 


Homo sapiens 


transcription factor TFJ1K 


2373 


100 , 


72; 


W41565 


Home 
sapiens' 
>W41564 
W41S64 08- 

APR-199f 

Human 

calpain. 

sapiens 


Human calpain.


1591 

[ 


99 


723 


AFJ61341 


Homo sapiens 


HSPC07* 


1097 


98 


72< 


AF187318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


72 i 


AC006708 


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


114? 


46 


72 ( 


AC006708 


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


98e 

1 


46 


727 


ACC248ie 


Caenorhabditis elegans


contains similarity to Pfam
family PF00400 (WD domain,
G-beta repeat), score=81.6,
E=1.4e-20, N=3


950 


44 


72£ 


AJ005697 


Homo sapiens 


JMb 


831 


4 7 


72^ 


Y4S377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 

27 . 


90fc 


97 


730 


G03933 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8012. 


57& 


100 


73 j 


AF01272O 


Oncorhynchus 
masou


GTP-binding protein


386t 


j 


73 2 


W73404 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8. 


862 




•73 3 


G02t50 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6731 . 


644 


97 


734 


AC024 813 


Caenorhabditis elegans


Hypothetical protein 
Y54Fl0AL.a 


152 


24 


73S 


AL035461 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol 
phosphatidyl transferase 
family member protein) 


1562 


98 


73* 


U0003? 


Caenorhabditis elegans


similar to S. cerevisiae YJU2
protein


60b 


41 


737 


AF079098 


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 






WO 01/53312



PCT/US00/34263



TABLE 2 



SEQ | ACCESSION
ID | NUMBER
NO:


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


73 £ j AJ131712 


Homo sapiens 


nucleolar RNA-helicase




100 


73 5 j AJ133I15 


Homo sapiens 


TSC-22-like protein




9! I 


740 | X98256 


Homo sapiens


M-phase phosphoprotein 9


f 95: 


100 | 


743 
"74"? 


X982S* 


Homo sapiens 


M-phase phosphoprotein . c


564 


74 , 


US7191 


Caenorhabditis elegans


Strong similarity to the YPT1
sub-family of RAS proteins


96 1 


8f 


74 1- i X76057 


Homo sapiens 


phosphomannose isomerase


219: 


100 


744 ) G03205 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7290.


496 


96 


|74 5 j X97064 


Homo sapiens


Sec23 protein


4034 


9S I 


746 | W93946 


Homo sapiens 


Human regulatory molecule
HRM-2 protein. 


994 


100 


74 ~ 


Y7338& 


Homo sapiens 


HTRM clone 3376404 protein 
sequence . 


156b 


99 


74 E- i Ml 9529 


Sus scrofa


follistatin A


1906 


96 


74? 

"7 50 


AJ249457 


Trichomonas 
vaginalis


centrin, putative


163 


26 


AC004410 


Homo sapiens 


fo£?9SS4_; 


2094 


100 1 


76, 


AF074 96 8 


Homo sapiens 


D471NG3 protein 


2167 


100 i 


AF252284 


Homo sapiens 


transcription specificity 
factor Spl 


4005 


100 


753 | AB049629


Homo sapiens


phospholysine 

phosphohistidine inorganic 
pyrophosphate phosphatase 


im 




764 | D7920S 


Homo sapiens 


ribosomal protein L3S-


16C 


7', 


755 


AB008430 


Homo sapiens 


CDEV 


14 i 


26 


756 


[ 1.32162 


Homo sapiens 


transcription factor


574 


80 


756 


AF037204 


Homo sapiens 


RING zinc finger protein


295 


54 


76C 


Y44250 


Homo
sapiens


Human cell signalling
protein-13.


625 


100 


76: 


AF218586 


Homo sapiens 


Cirie-b 


1136 


100 


76. 


U38934 


Gallus
gallus


hi st one \l2h 


625 


97 


76 j 


AF226053 


Homo sapiens 


HSKM-F. 


606 


3i 


764


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


3626


100


765 


De7446 


Homo sapiens 


Similar to a C. elegans
protein encoded in cosmid
C27F2 (U40419)


566 


36 


766 
"767 


AL023828 


Caenorhabditis
elegans


Y17G73. 14 


20C 


27 


Y82777 


Homo sapiens 


Human chordin related protein
(Clone dw€€5 4).


2561 


95 


76e 


XS2475 


Homo sapiens | 1TBA2 


1429 


100 


769 


Y42752 


Homo sapiens 


Human calcium binding protein
3 (CaBP-3). 


1426 


100 


770 


X51416 


Homo sapiens 


hormone receptor hERR1 (AA 1-
521)


2641 


97 


771 


7s0006591 


Homo sapiens 


cysteine-rich protein


1793 


100 


772 


A08695 


Homo sapiens 


rapi 


935 


100 


772 


Z12173


Homo sapiens


N-acetylglucosamine-6-
sulphatase


2S70 


100 


774


Y91950 


Homo sapiens 


Human cytoskeleton associated
protein 5 (CYSKP-5).


566


43


77* 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


855


56


777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56


77E 


G01880 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5961.


84 5 


QC ! 


779 


AO01259O 


Homo sapiens 


glucose 1-dehydrogenase


4155 


95 


78C 


AL078S82 


Homo sapiens | dJ130E4.2 (KIAA0796)


1323 


66 


781 


Z75955 


Caenorhabditis
elegans


similar to mitochondrial
carrier protein


384 


34 


782 


AL109965


Homo 
sapiens 


dJH21G12.2 (SCAN domain- 
containing 1 protein) 


90C 


100


783 


AF061262 


Mus
musculus


semaF cytoplasmic domain
associated protein 1


1316 


83 


784 


G03373 


Homo sapiens


Human secreted protein, SEQ
ID NO: 7954.


645


95






V8444: 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein.


2074 


100 

i 


• 


Y00916 


Homo sapiens


Human Rab protein, RAB?-- . 
protein sequence • 


1048 


SB 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit


1548 


39 j 


7 86- 


AB03S384 


Homo sapiens


SRp2S nuclear protein 






789


AF024621


Homo sapiens 


ang: 


264< 


100


790 


AJ006710


Rattus
norvegicus


phosphatidylinositol 3-kinase


4508 


57 1 

| 




V00638 


bacteriophage
lambda


reading frame ealC 


600 


100


70F0491C3 


Homo sapiens 


Huntingtin interacting
protein


819 


100 




Z26317


Homo sapiens


desmoglein 2


4810


99


7 9 6 


Y 76 864 


Homo sapiens


Opt i rifihl h omA hirdi no 

protein-7sequence 


5C80 




7 97 


Ul 53 r> 1 


Gallus
gallus


l i yi->lf x rjcye/i 


3 72 


37 


796 


U97181 


Caenorhabditis
elegans


strong similarity to the

D1 / d14 f »tk i 'i v r> f W 1 rta cf>< 
trxJ/rX'* IPWiJ.±y OI fWrjCTS>C 


227 


26 


799 


AF112201


Homo sapiens


neuronal protein NP25


1053


100


P 0 C- 


Ar >/J4 /bL 


Rattus
norvegicus


serine-arginine-rich splicing
regulatory protein £RJRP8t


DEC 


6" i 

1 

i 




»r^<: 7 o c: ^ 
Af ^ to / Od<; 


Homo sapiens 


placental protein 13-like
protein


743 


QCi 

1 




AF208851 


Homo sapiens 


Btf-OOS 


766 


B0 1 


8 0 1 

i 


/.o ± \jy ' 


L<Hr 1 JUI lldiJU, L 


Similarity to Human
retinoblastoma-binding
protein RBAP46 yk662dl2.'

v^, \jy \ f\_ o i. i wilt km II A o ML. J i r 


152 


27 i 

1 

1 
1 


eo4 


G0211? 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6194


496 


96 




AL121673


Homo sapiens


bA305P22.1 (novel protein)


1160


100


b 06 


AC0134B3 


Arabidopsis
thaliana


putative GTPase activator 
protein 


264 


3C 1 


6 0 7 


AC0134B3 


Arabidopsis
thaliana


putative GTPase activator 
protein 


264 


3C i 


fi06 


AB0138Bb 


Homo sapiens 


beta-ureidopropionase


1494


100


PO 1 ; 


AF078842 


Homo sapiens 


HOTTL protein 


1581 


1 


6" 1 C 


AF161423


Homo sapiens


HSPC303


2134


96


611 


AF261689 


Homo sapiens


DNA polymerase epsilon p17
subunit


734 


100 


ti2 

i 


Z74029 


Caenorhabditis
elegans


Similarity to C. elegans
alcohol dehydrogenase comet 
from this gene 


610 


73 

t 


i 


Z73497 


Homo sapiens 


CU24 0C2.2 (Core histone
H2A/H2B/H3/H4) 


324 


100 


| 814 


W876B9 


Homo
sapiens


Human HTXFT1 9 polypeptide. 


1484 


95 


ei5 


X16282 


Homo 
sapiens 


zinc finger protein 1217 AA', 
(1 is 2nd base in codon) 


1109 


99 

i 


816 


Z92539 


Mycobacterium
tuberculosis


ptn 


300 


36 


1 

1 








- — i 


I 618 


AB030483 


Mus musculus 


B9 


197 




PM9 


AL11755S 


Homo sapiens 


hypothetical protein


321 


194 


[ E20 


AC005326 


Homo sapiens 


R26660_2, partial CDS


865 


97


821


G039S1


Homo sapiens


Human secreted protein, SEQ
ID NO: 8032.


700 






L34807 


Musca
domestica


transposase


174 


20 


823 


G02928


Homo sapiens


Human secreted protein, SEQ
ID NO: 7009.


558 


76 


624 


Z99531 


Schizosaccharomyces
pombe


caffeine-induced death
protein 2


184


29


825 
826


AJ006692 


Homo sapiens


ultra high sulfur keratin


6 9:- 


1 6i t 


U23037 


Oryctolagus
cuniculus


eIF-2Bepsilon


340t 


9i ~ j 
1 


827 


G03412 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 7493 . 


464 


100


928 


Y30327 


Homo sapiens 


Human secreted protein
encoded from gene l'# 


113 


\ 


829 


Y32199 


Homo sapiens


Human receptor molecule (REC)
encoded by Incyte clone 
2022379. 


1C27 


20( 1 


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


c c 


832 


AB01154i 


Homo sapiens 


MEGFS 


2097


100


833 


G02639


Homo sapiens


Human secreted protein, SEQ 
ID NO: 6720. 


223 | 70 


834 


AF119664 


Homo sapiens 


transcriptional regulator
protein HCNGP


1574


100


835 


AF119664


Homo sapiens


transcriptional regulator
protein HCNGP


1144 


8i 


836 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1446 


94 


837


X12527


Homo sapiens


C protein (AA 1-1 55.


918


100


63b 

I 


1)3286^ 


Drosophila
melanogaster


lmotte protein 


164 


2 1> 


839 | AF067730


Homo sapiens


TLS-associated protein TASR-2


631 


5» 


840 


U27831 


Homo sapiens


striatum-enriched phosphatase


2840 


9t 


84 2 


AF286366 


Homo sapiens 


CamKl-like protein kinase 


1796 


100


842 | G02309

843 | AE003615


Homo sapiens


Human secreted protein, SEQ
ID NO: 6390.


276


9k


Drosophila
melanogaster


aae3 gene product 


113 


4* 


844


G01350


Homo sapiens


Human secreted protein, SEQ
ID NO: 5431.


629


100


845 


U2783e 


Mus musculus 


glycosyl-phosphatidyl-
inositol-anchored protein
homolog


3305 


9( 


847 


Y87788 


Homo sapiens


Human RBF-26 protein


2026


100


848 


AF164794 


Homo sapiens 


Piff33 protein homolog


2398


100


849 


U41315 


Homo sapiens 


2NF127-XF 


2458 


9s 


850 


AF192784 


Homo sapiens


makorin 1 


2062 


97 


851 


Y58628 


Homo sapiens


Protein regulating gene
expression PRGE-2i.


1548


100


852 


Z22968


Homo sapiens


M130 antigen


6205


100


853 


Z22971 


Homo sapiens 


M130 antigen extracellular 
variant 


6380 


100


854


G03362 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7443.


33C 


9fc 


855


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443 . 


203 


100


856


AF28S1J.8 


Homo sapiens 


CGI-201


4S2


100


857


AC006069 


Arabidopsis
thaliana


putative cleavage and
polyadenylation specificity
factor


1383 


5b 


858


AL021546


Homo sapiens


Cytochrome C Oxidase
Polypeptide VIa-liver
precursor (EC 1.9.3.1)


593


100


659 


L02956 | Xenopus
laevis


ribonucleoprotein


1664 


8b 


860 


AF202947 | Homo sapiens


MEK binding partner 2


616


100


862 


L31783 


Mus musculus 


uridine kinase 


1266 


97


R62 


AF161472 


Homo sapiens


HSPC123 


602 


73 


863 


Z49068 


Caenorhabditis
elegans


mitochondrial carrier protein 


370 


43 


864 


AF1S4108 


Homo sapiens 


tumor necrosis factor type 1
receptor associated protein


3559


99




AE001530


Helicobacter
pylori J99


putative


23C 


-> . 


866 


X57PC*. 


Homo sapiens 


immunoglobulin lambda light
chain


€9S JS. 


867


AL031673 


Homo sapiens 


dJ694r^4.1 (PUTATIVE novel
KRAB box protein with 18 C2H2
type Zinc finger domains)


406(


99


966 


Y116SS 


Homo sapiens 


phosphate cyclase


23 y


100


865 


A?19256fc 


Homo sapiens


high-glucose-regulated
protein *


3041 


99 


670 


AB 02064* | 


Homo sapiens 


KIAA0841 protein


323'/ 


S5 


871 


AL03142' 


Homo sapiens 


dJl67A19.1 (novel protein) 


160* 


100


872


AF151534


Homo sapiens


core histone macroH2A2.;


186* 


100 


873 


AL02133: 


Homo sapiens 


dJ366N23.1 (putative C.
elegans UNC-93 (protein 1,
C46F11.1) LIKE protein)


1125


100


874


XI 4 60f


Homo sapiens


propionyl-CoA carboxylase


35 71-


100


87S 


Al.l 1733 4 


Homo sapiens


dO 6 8 7 Fl 1 . J (novel protein
(part of translation of cDNA
DKFZp434N061, Em:AL110249))


30*


100


976 


X 7 9 4 £ 5 


Saccharomyces
cerevisiae


E-925 protein


44* 




877 


Y530O2 


Homo sapiens 


Human secreted protein clone 
dn834 i protein sequence SEO. 
ID NO~:f 


81 1 


10C 


878 


M"2 8 a 064 


Homo sapiens 


CHMP1 . ? 


9S", 


100


879


X7 941'/


Sus scrofa


40S ribosomal protein S12


68"' 


100 


880 


AF001317 


Saccharomyce 
s cerevisiae 


Soi lp 


47f 


2f 


8 81 


y g 7 2 '/ J 


r* ^ ' 1 1 \ \ } JO l LM%) 


Human signal peptide
containing protein HSPP-52
SEQ ID NO: SI


254% 


10C 


882 


Mi <1 01 6 

I J 1 i V J C 










883 


AJB04:26: 


Homo sapiens 


calcium-independent
phospholipase A2


290?


100


884 


AF02031 j 


Mus musculus 


proline-rich protein 48


999 


84 


885


Yi 093*


Homo sapiens


hypothetical protein


1104 


99 


886 


AF073997 


Mus musculus 


myotubularin related protein
1


866 


31 


887


YS78S?


Homo sapiens


Human transmembrane protein
HTMPN-}'- 


109S 


94 


888 


A.L1 1763b 


Homo sapiens 


hypothetical protein 


929 


95 


889


AF210317 


Homo sapiens 


facilitative glucose
transporter family member
GLUT9


204t 


oc 


890 


Y36033


Homo sapiens


Extended human secreted
protein sequence, SEQ ID NO.
416. 


583 


100 


891


Y36033


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


192 


5-/ 


892 


AF237633 


Homo sapiens 


ubiquitous tropomodulin U-
Tmod


1798


100


893 


AF090929 


Homo sapiens 


PR0047*/; 


653 


9S- 


894 


AL031226


Homo sapiens


dJ1033B10.2 (WD40 protein
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1)


3196


100


895 


AL031226


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S . 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


2825 


9t 


896 


AF171102 


Homo sapiens 


retinal degeneration B beta


1302 


9£ 


897 


AE003SS1 


Drosophila 
melanogaster


CG18176 gene product


633 


33


B3£ 


Au23794* 


Homo sapiens DEAD Box Protein S I 244 • 


100


Z9716', 


Homo sapiens ; EKEI i 6 24 


100 


900 


Z971C< 


Homo sapiens f KKF.l 1 4 0^ 


95 


901 


AJ24SS87 


Homo sapiens | Kruppel-type zinc finger


194; 


100 


90i 


AF021034 


Homo sapiens 


GTP-binding protein RAB22A 


10:: 


100 


90? 


K9S9S;- 


Homo sapiens 


Eukaryotic cell growth
inhibiting factor.


414 


96 


904 


L047 3!: 


Homo sapiens


kinesin light chain


1936 


72 


905 


AE0O354C 


Drosophila
melanogaster


CG10984 gene product


441 


33 


90* 


MSSS4; 


Homo sapiens 


guanylate binding protein 
isoform ; 


2993 


98 


907 


MSS54; 


Homo sapiens 


guanylate binding protein 
isoform !


290: 


96 


90t 


W8 4 0fl. L 


(IK JUILJ OO^UtllO 


Hnrnj^n -n^nihiT.a tip ft)Cir>n nrnf pin 

it UNIC« ( * rM-^ u\JJ A. dilf A. kA ^> X KJi 1 A \J k* A I 1 

WDProi . 


188f 


100 


909 | AF268676 

i 


Homo
sapiens


TNF intracellular domain- 
interacting protein 


647 


100 


91C 


AB029150 


Homo sapiens


KRAB zinc finger protein
HFB101L 


21 ?f 


100 


911


G0287J 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6951.


52: 


100 


912


G0316: 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7242.


387


87


913 


AJ24372: 


Homo 
sapiens) 
>Y9250e 
Y92508 13- 

h pri 1 Ann n r 

ArH-^UUU U b - 
OCT- 1998 
Human OXRE- 

sapiens 


dTDP-4-keto-6-deoxy-D-glucose
4-reductase


17l( 


100 


914 


U241B9 


Caenorhabditis
elegans


Dypot £ ^ j prot ei n 1207-1; 
Method: conceptual 
translation supplied by 
author? 


2"4/, " 


41 


915 


Y02591 


Homo sapiens


A human progesterone receptor 
complex p23-like protein 


843 


99 


915 


AE000984 


Archaeoglobus
fulgidus


dinitrogenase reductase
activating glycohydrolase
(draG)


171


26


918 


M231S9 


Cricetus
cricetus


DHFR-coamplified protein


163 


30 


919 


L12016 


Caenorhabdit 
is elegans 


putative 


1231 


41 


920 | AF102177


Homo sapiens


tumor antigen SLP-8p 


1260 


97 


921 


AL096712 


Homo sapiens 


dJ744I24.2 (similar to a
novel human gene mapping to 
Activator) 


1017 


78 


922 


AL161495 


Arabidopsis
thaliana


putative WD-repeat protein


866 


42 


922 


AL161495 


Arabidopsis
thaliana


putative WD-repeat protein


442 


36 


924 


U9700I 


Caenorhabdit 
is elegans 


similar to 

Schizosaccharomyces pombe


605 


SI 


925 


X7197B 


Mus musculus


Fif 


1503 


95 


926


K92288


Drosophila
melanogaster


beta-spectrin


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9


1392 


100 


926 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_l . 


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate- 
epimerase 


912 


100 


931 


026991 


Caenorhabditis
elegans


coded for by C. elegans cDNA
cr\2 ic


660


55


932


AL08OO6!: 


Homo sapiens 


hypothetical protein


21C 


25 


93? 


C0i384 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5S€b.


76'. 


96 

I 


934 


AO 2 7 64 6 5 


Homo sapiens 


integral membrane transporter
protein


:200


iob


935 


AL03 56E1 


Homo sapiens 


dJ7S6G23.3 (novel protein
similar to Drosophila
transcriptional repressor)


1142


80


936 


AB026808 


Mus musculus


synaptotagmin XI


2142 


95 


937 


AB015345 


Homo sapiens 


KR3HFfc22J< 


2 6 0~1 


99 


938 


X65724 


Homo sapiens 


orf: 


4 St 


100


93$ 


W89024 


Homo sapiens 


Polypeptide fragment encoded
by gene 156.


1461 


100 


940 


G04047


Homo sapiens


Human secreted protein, SEQ
ID NO: 812*.


ir. 


100 


911 


AF094S8J 


Homo sapiens 


putative HIV-1 infection
related protein 


4 5, 


100 


942 


AC02420C 


Caenorhabditis
elegans


contains similarity to
several zinc finger proteins
but not to the zinc finger
domains


3.^C 


69 

i 


943


AF1297S6 


Homo sapiens 


G5t 


27% 


100 


944 


K2j76£ 


Rattus
norvegicus


alpha-tropomyosin


131- 


96 


945 


AC009917 


Arabidopsis 
thaliana 


Contains similarity to


58> 


47 


946


AF223468 


Homo sapiens 


AZ)02i protein 


551 


44 


947


AF0S5473 


Homo sapiens 


GAGE - I 


27*- 


51 


948


X7S7SE 


Homo sapiens 


protein kinase C mu 


2019 


68 


949


AF1439S6


Mus musculus


coronin-1


2iOC 


93 


950 


Y36729 


Homo 
sapiens 


Human PG1 protein sequence. 


1 061 


99 


95 j 


W4S043 


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2. 


2 02 


67 , 

i 


552 


AB01686i 


Arabidopsis 
thaliana 


gene_id:MXC17.7-


20? 


46 

1 


953 


Y01781 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y2S341 Y2S341 01-JUL-
1999 12-AUG-1998 Human NCF. ■ 2
protein


3£L


100


954 


AF14561S 


Drosophila
melanogaster 


BCDNA.GHQ3 377 


e2:- 


46 


955 


U09410 


Homo sapiens 


zinc finger protein ZNF131


> <1 £3 


99 


956 


U09410 


Homo sapiens 


zinc finger protein ZNF131


1853 


99 


957 


AF1S5623 


Homo sapiens 


cholinephosphotransferase 3
alpha


2i2e 


99 


95 * 


X94917 


Drosophila
melanogaster 


heac-eievated expression in 
0.9 k* 


15: 


32 


959 


U54807 


Rattus 
norvegicus


GTP-binding protein


::67 


97 


96C 


AF05e807 


Bos taurus 


GTP-binding protein rah


6 Of


97


961 


G03244 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7325.


47* 


100 


962 


AF078B50 


Homo sapiens 


steroid dehydrogenase homolog


58? 


40 


963 


AP0017S4 


Homo sapiens 


transient receptor potential - 
related channel 7, a novel 
putative Ca2+ channel protein


317


30


964 


AL03S419 


Homo sapiens 


dJ3lOOHl3.I (putative novel 
protein)


1129 


100 


96S 


X61381 


Rattus 
rattus 


interferon-induced protein


20; 


46 


966 


D38169 


Homo
sapiens


inositol 1,4,5-trisphosphate
3-kinase isoenzyme


327* 


100 


967 


AL031432 


Homo
sapiens


dJ4 6 5N24.2.1 (PUTATIVE novel
protein) (isoform 1)


651- 


100 





~ 3€T " 


U7 9 27b 


Homo sapiens 


unknown


611 


100 


| 96< 


AOC11306 


Homo
sapiens


guanine nucleotide exchange
factor (long isoform)


27S; 


95- 




AF261134 


Homo sapiens


exosome component Rrp4t


118fc 


100 




UE>333fc 


Caenorhabditis
elegans


weak similarity over a short
region to myosin heavy chain


536 


21J i 


1 972 


AC01B749 


Leishmania
major


L8840.12 


589 


S3 




t 97- 


AFl 8B504 


Mus musculus 


liNV 


544 


8L 




U2S80- 


Homo sapiens


Taxi binding protein 


852 


9£ 


971 


AF049S23 


Homo sapiens 


huntingtin-interacting
protein HYPA/FBP11 


1390 


97 


97( 


AF161S30 


Homo sapiens 


HSPC182 


1040 


100 


977


G04020


Homo sapiens


Human secreted protein, SEQ
ID NO: 8101.


626 


100 


97t 


AF3 64797 


Homo sapiens 


ribosomal protein L17 isolog 


908 


100 


| 97 S 


U94991


Xenopus
laevis


transcription factor XLMD3 


795 


97 




S7377? 


Homo sapiens 


calmitine; calsequestrine


2029 


100 


j 98. 


Y94 888 


Homo
sapiens


Human protein clone HP01461 


2501 


100 




AJ243191 


Homo sapiens 


heat shock protein 


827 


9€ 


96> 

1 


X65020 


Bos taurus 


PSST subunit of the NADH:
ubiquinone oxidoreductase 
complex 


964 


as 

i 


98 < 


AJ249207 


Rhodococcus
sp. AD45


putative racemase


351 


43 




Z30093 


Homo sapiens 


basic transcription factor 2,
35 kD subunit 


1576 


99 


S8f 


AB03083S 


Homo sapiens 


contains two glutamine rich
domains, three zinc-finger
domains, and matrin
homologous domain 3 (MH3)


4697 


99 


98' 


AF227258 


Bos taurus 


RPGR-interacting protein-1


1262 




98f- 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


4048 


95 


985- 


AL022238


Homo sapiens 


dJl042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE) 


2321 


99 

1 


99C 


AF161426 


Homo sapiens 


HSPC308


448 


92 » 


993 


AF161426


Homo sapiens 


HSPC308 


448 


92 


992 


AF161426 


Homo sapiens


HSPC308


453


92


9s: 


AL023859 


Schizosaccharomyces
pombe


tRNA-splicing endonuclease
subunit


172 


42 


ogx. 


AL049631 


Homo sapiens 


d05l3M9.1 (novel Homeobox 
domain protein) 


241 


47 


99 


AC0C5253 


Homo sapiens 


R26445_l 


902 


100 


99< 


AF265206 


Homo sapiens


M0G1 isoform A 


974 


100 


99*: 


AJ248285 


Pyrococcus
abyssi


sarcosine oxidase, subunit 
beta (soxB) 


195 


28 


99* 


AE0G3641 


Drosophila 
melanogaster 


BG:DS00941.3 gene product 


218 


SB 


999 


W69343 


Homo
sapiens


Secreted protein of clone
CR930_1.


1340 


99 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP
translocase T1 mRNA with
GenBank Accession Number
M24102.1


1543 


100 


1001 


Y73382 


Homo sapiens 


HTRM clone 1877278 protein 
sequence. 


1660 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100 


1003 


AE004944 


Pseudomonas 
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ4 62023.2 (novel protein) 


2058 


100


1005 

1 


S45367 


Canis
familiaris


centractin


1949 


100


1006 | S45367


Canis
familiaris


centractin


131b 


96 


1007 | AB022158 


Mus
musculus


chaperonin containing TCP-1
epsilon subunit


2649


96


riOCfc | 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 3fc. 


1282 


97 


1009 | AB011414


Homo sapiens


Kruppel-type zinc finger
protein


1671


58 


1011 | Z68216 


Caenorhabdit 
is elegans 


K01H12.1 


26$ 


67 

1 


1011 | AB011414


Homo sapiens


Kruppel-type zinc finger
protein


1671 


58 

I 


101? | 21400C 


Homo sapiens 


ring: 


2017 


100 


1013 | G02841


Homo sapiens


Human secreted protein, SEQ
ID NO: 6922.


332 


93 

1 | 


1014


AF145659


Drosophila
melanogaster


BcDNA.GH10333


1244


52


1015 


Y02860 


Homo sapiens 


Fragment of human secreted
protein encoded by gene 65.


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein 


77x. 


! 


1017 


V9S446 


Homo sapiens 


Human PR01759 (UNQ832) amino 
acid sequence SEQ ID NO:37*.


2323


100


1010 


X6725C 


Rattus
norvegicus


n-chimaerin


1710 


97 


1019 


AF183417 


Homo
sapiens


microtubule-associated
proteins 1A/1B light chain ; 


631 


100 


1020 


AF164795 


Homo sapiens


Sex-regulated protein janus-a


674 


100 


1021 


AF190625 


Coturnix
coturnix


qdei - i 


636 


96 


1022 


AL133363 


Arabidopsis
thaliana


putative protein


15f 


37 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2483


100


1024 


AY007091 


Homo sapiens


similar to Homo sapiens
mammalian inositol
hexakisphosphate kinase 2
(IP6K2) mRNA with Ge


2243


100


1025 


X69910 


Homo sapiens 


P63 protein 


2959 


95 


1026 


U80736 


Homo sapiens


CAGP5 


1657 


100


1027 | AB029333 


Halocynthia
roretzi


HrPET-1 


1046 


j 


1028 | AB032931


Homo sapiens


ubiquitin-conjugating enzyme
isolog


1045 


100 


1029 | G01797


Homo sapiens


Human secreted protein, SEQ
ID NO: 5878.


749


98


1030 | G01797


Homo sapiens


Human secreted protein, SEQ
ID NO: 5878.


749


96


1031 | AF193795


Homo sapiens


vacuolar sorting protein
VPS29/PEP11


960


100


1032 | AJ222968


Mus musculus


L-periaxin 


120 


30 


1033 | Z81317


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family
protein


685 




1034


Y41515


Homo sapiens


Fragment of human secreted 
protein encoded by gene 75. 


1321 


99 


1035


AJ276004


Mus musculus


Poxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabditis
elegans


H14A12.3 gene product 


190 


30 


1037 


U37251


Homo sapiens


Description: KRAB zinc finger
protein; this is a splicing 
supplied by author 


196 


43 


1038 | W74580


Homo 
sapiens 


Human membrane protein 
BA03O6. 


1921 


97 


1039 


U88173 


Caenorhabdit 
is elegans 


weak similarity to
Arabidopsis thaliana
ubiquitin-like protein e


331 


80 





T04C 

f 104 a 


" AF2 9cT64 


Homo sapiens


blood group carrier molecule
DOK5 


163V i 95- " ) 


Y9673C 


Homo
sapiens


PRO539, a Costal-2 homologue.


162 | 2, 


1042


AF140663


Mus musculus


F-box protein FViDi 


2397 


9* 1 


104* 


AF151023 


Homo sapiens


i-spcies-


1104


100




AF181631 


Drosophila
melanogaster


bcD*)A.GH0492? 


204 




1045 


Y7798I 


Homo sapiens


Human collectin amino acid
sequence . 


194C 


100 


1046 


AJ243572 


Homo sapiens 


6-phosphogluconolactonase 


1317 


100


1047


A3035863 


Homo sapiens 


ATP specific succinyl CoA
synthetase beta subunit
precursor 


2324 


9S 


j~io4e 


AL034S50 


Homo sapiens 


O011B4F4.2 (novel protein 
similar to nucleolar protein 

4 :nom) (nolp) ; 


981 




104 S 


AF163825 


Homo sapiens


pre-B lymphocyte protein '* 


634 


100 


1050


AF20194?


Homo sapiens


60S ribosomal protein L31
isolog


86*


100


1051 


AF190624 


Mus musculus 


meg 1-1 


236 




1052 


AF003529 


Drosophila
melanogaster


CG6151 gene product


160


44


1053 


G01193 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5272 


646 


96 


1 054 




Neisseria
meningitidis


Glu-tRNA(Gln)
amidotransferase subunit h


6e; 


44 


1055 


AF191856


Rattus
norvegicus


tRNA selenocysteine
associated protein


1525 


99 


1056 


U83649 


Chlamydomonas
reinhardtii


Mr 19,000 outer arm dynein
light chain


244 


34 


1057 


AF159143 


Homo sapiens 


breast cancer metastasis- 
suppressor 1


663 \ 5? 

i 


1056 


AF230929 


Homo
sapiens


keratinocyte annexin-like
protein pemphaxin 


171C 


9£ 


1059 


AJ2709S2 


Homo sapiens 


putative membrane protein 


1363 


100 


1050 


AF224263 


Heterodontus 
francisci


HoxDfc 


742 


83 

i 


1061 


5(63417 


Homo sapiens 


1RLE 


103'/ 


100


1062 


AL079345 


Streptomyces 
coelicolor 
A3 (2) 


hypothetical protein


143


27


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10 
(HYDRL-10).


2547 


100 


1064 


AF363614 


Homo sapiens 


acetyl-CoA synthetase


3493 


99 


1065 


Y13356 


Homo sapiens 


Amino acid sequence of
protein PRO221.


1363


100


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus
GTP-binding protein; similar
to AE000771 (PID:g2S84292)


662 


98 


1067 


Y18930 


Sulfolobus
solfataricus


hypothetical protein 


162 


29 


1066 


R6S969 


Homo
sapiens


T98G
Glioblastoma-derived
polypeptide.


887 


100 


1069 


Y07964 


Homo sapiens


Human secreted protein
fragment


863 


96 


1070 


AF177476


Rattus
norvegicus


CDK5 activator-binding
protein 


1995 


86 


1071 


AF245S0S 


Homo sapiens 


adlican 


3109 


99 


1072 


US2794 


Mus musculus


alpha glucosidase II, beta
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7970. 


698 


98 


1074 


U15779 


Homo sapiens 


p7d 


380 


28 


1075 


Y13392 


Homo sapiens


Amino acid sequence of
protein PR032fc.


1271


91


1076

1077


AF161457 | Homo sapiens


HSPC339 


571 | 101 


Y7 9T>09 


Homo sapiens


Human carbohydrate associated 
protein CKBAP- - . 


2151 j 9fc 
i 


1078 


AF222466 


Homo sapiens


HT015 protein


831 


6t 


1079 


AL13296S 


Arabidopsis
thaliana


putative WD-40 repeat protein


286


25


1080 


AB024937


Homo sapiens


LUNX


1284


100


1083 


Vi476e 


Homo sapiens


V-ATPase G-subunit like
protein


579


100


1082 


AF016416 


Caenorhabditis
elegans


F29A7.4 gene product


141 


IT 


1083 


L13291


Homo sapiens


ADP-ribosylarginine hydrolase


802 


4 5 


1084 


AB041541 


Mus musculus


unnamed protein product 


151 


44 


1085


G01S22 


Homo sapiens


Human secreted protein, SEQ
ID NO: 6003.


202 


97 


1086 

"l"0 87 


AB03O814


Homo sapiens


H-RSV107 protein homolog


833


100


AF15163& 


Homo sapiens


phosphatidylcholine transfer
protein


1142 


101 


1088 


Y8*432 


Homo sapiens


Amino acid sequence of a
human RNA-associated 
protein . 


2783 


100 


1089 


Y94867 


Homo
sapiens


Human protein clone HP10563.


613 


100 


1090 


AK02398/ 


Homo sapiens 


unnamed protein product 


130 


49 


1091


AB041 bfcJt


Mus musculus


unnamed protein product


1103 


8i 


1092 


Y71277 


Homo sapiens


Human Zlipo3 protein 


606 


10C 


1093 

i 


U34973 


Mus musculus


protein tyrosine phosphatase-
like


1133 


95 


1094


Y66 6 77


Homo
sapiens


Membrane-bound protein
PRO828


522 


5( 


1095


Y87276


Homo sapiens


Human signal peptide
containing protein HSPP-53 
SEQ ID NO: 53. 


1025 




1096 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53.


863 


9b 


1097 


AF161455 


Homo sapiens 


HSPC337


742 


96 


1098


U80029


Caenorhabditis
elegans


similar to thioredoxin


242 


i 


1099


AJ005866


Homo sapiens


Sqv-7-like protein


1321 


9b 


1100 


AJ005866 


Homo sapiens 


Sqv-7-like protein


1118


99


1101


AJ005866


Homo sapiens 


Sqv-7-like protein 


891 


9? 


1102 


AJ005866 


Homo sapiens 


Sqv-7-like protein


1016 


99 


1103 


AL110244


Homo sapiens


hypothetical protein 


299 


31 


1104 


AF242194


Drosophila melanogaster


brakeless-E 


147 


52 


1105


AL031010 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel 
protein similar to C. elegans 
C02C2.S) 


968 


100 


1106 


U2801C 


Mus musculus 


parathion hydrolase 
(phosphotriesterase) -related 
protein 


1624 


87 


1107


AJ27835C 


Homo sapiens 


putative lipid kinase 


2207 


99 


1108


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814.


4 95 


3b 


1109 


AF217287 


Drosophila
melanogaster 


G protein RhoBTB


834 - 


54 


1110 


Y28921 


Homo sapiens


Human regulatory protein 
HRGP-7.


941 


48 


1111 


Y28921 


Homo sapiens


Human regulatory protein 
HRGP-7. 


1331 


53 


1112


AF176704 


Homo sapiens 


F-box protein FEX9 


2027


99 


1113 


AF182076 


Homo sapiens


glioma tumor suppressor
candidate region protein 2


2418


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8120.


475


96
TABLE 2



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY






1115 


A.-229435


Mus musculus


zinc finger protein 281


16 5* 


\Sj 1 


1116


L4C357 


Homo sapiens


thyroid receptor interactor


505 


100


1117 


L4C357 


Homo sapiens


thyroid receptor interactor


<(K 


8: 


1118


A12D5! 


Homo sapiens 


Human XSL cDN/-. 


167} 


10( 


1119


AL161542


Arabidopsis
thaliana 


isomerase like protein


60? 


5> 


1120


ALC23754 


Homo sapiens 


dJ272L16.1 (Rat
Ca2+/Calmodulin dependent
Protein Kinase LIKE protein)


234 ■ 


9> 


1121 


YS790j 


Homo sapiens


Human transmembrane protein 
KTMPN-2S. 


321 


3C 


1122 


Z14122 


Xenopus 
laevis 


xlcl; 


455


77 


1123 


AF22543* 


Homo sapiens 


lipase 


353: 


9'/ 


1124 


Y06S18 


Homo sapiens 


Zen GTPase interacting 
protein ZI F 


3227 


100


1125 


ALC35690 


Homo sapiens


du*202I21.i tnovel protein) 


9s: 


100 


1126 


AJ000217 


Homo sapiens 


CLICi 


326f 


95 


1127 


AB030505 


Mus musculus


UBE- lei 




75 


1128 


Y73375 


Homo sapiens 


HTRM clone 1427838 protein
sequence.


874 


100


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase
amino acid sequence.


877 


100


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein)


557 


100


1131 


Y91945 


Homo sapiens 


Human chaperone protein 6
(HCHP-6).


1408 


100


1132 


Z68197 


Schizosaccharomyces pombe


putative nuclear pore protein 


596 


35 


1133 


Z66197 


Schizosaccharomyces pombe


putative nuclear pore protein


389


31


1134 


AF3 80681 


Homo sapiens 


guanine nucleoside exchange 
factor


3597 


100 


1135 


AF079765 


Mus musculus


enhancer of polycomb


264 


41 


1136


M62419


Mus musculus


clathrin-associated protein


2189


95


1137


AO 006 2 15 


Drosophila 
melanogaster


clathrin-associated protein


125< 


It 


1138


Y76216 


Homo sapiens 


Human secreted protein 
encoded by gene 95.


44L 


9£ 


1139 


wee:o4 


Homo sapiens


A Rab protein designated 
HRABS- 2 . 


1065 


95 


1140


Y13403


Homo sapiens 


Amino acid sequence of 
protein PRO235.


3975 


9E 


1141 


WE5026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product . 


330S 


100 


1142 


Y13402 


Homo sapiens


Amino acid sequence of
protein PRO310 


1694 


95 


1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7956.


660 


99 


1144 


Y12927 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145 


Y12917 


Homo sapiens


Amino acid sequence of a 
human secreted peptide. 


1096 


100


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1148 


G02548


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6629. 


3 70 


99 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein
sequence.


1492 


100 


1150 


W74641 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone


228 


55 


KKAAR61- 




1151


AF044201


Rattus norvegicus


neural membrane protein 35;
NMP35


1571


92 


1152


AF156774 


Homo sapiens


lysophosphatidic acid
acyltransferase-gamma1


105? 


I 9S 


1153 


Abl 18501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


87. 


64 


1154


AF131852 


Homo sapiens 


Unknown


4 7 - 


100 


1155


Y417Ch 


Homo sapiens


Human PRO352 protein
sequence.


13E-. 


97 


1156 


G04 03* 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8117


60-. 


99 


1157 


AK112444 


Lupinus luteus


b-asparaginase 


287 


43 


1158


AF151846 


Homo sapiens 


v-oj-yu protein 


23, 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


244 5 


100 


1160 


ABO 01773 


Ciona savignyi


PEM-1


1 y t 


33 

l 


1161 


Y8 73 3 f 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


74< 


" 1 


1162 


Y 8 7 3 3 G 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 

SEQ ID NO: 107.


74 ( 


fl 3 

i 
1 


1163


AF123534


Homo sapiens


HP1-BP74 protein


2 7 2 _•


96


1164


AF232226


Danio rerio


Dedd 1


1 91


4 1


1165


Abl 1 8 501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


lot: 


71 


1166


Ml j 1 J 03UJ 


Homo sapiens


dJ1191N16.1 (A novel protein
(translation of the cDNA 


94 ! 


76 


1167


AF187733


Homo sapiens


syntaphilin


83j


42


1168






95


55


1169


AF064604


Homo sapiens 


KE03 protein 


324 


33 


1170


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6 


1193 


100 


1171 


L031B8 


Saccharomyces cerevisiae


putative


181 


22 


1172 


AF113751


Mus musculus


nuclear pore membrane 
glycoprotein POM210 


3 94 - 


81 


1173 


AJ24S417 


Homo sapiens 


G5b protein 


79* 


100 


1174 


AL022238


Homo sapiens 


dJ1042K10.3 (novel protein) 


i2e. c 


100 


1175


U41278 


Caenorhabditis elegans


F33G12.3 gene product


33^ 


28 


1176 


M3S617 


Homo sapiens 


T-cell receptor V-alpha-J- 
alpha region 


28< 


83 


1177 


AC012680 


Arabidopsis thaliana


putative protein phosphatase 
2C; 55455-56414 


205: 


37 


1178


G01345


Homo sapiens


Human secreted protein, SEQ 
ID NO: 5426. 


69; 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein
similar to worm, Arabidopsis
and pine proteins) 


134 2 


100 


1180


AF039716 


Caenorhabditis elegans


similar to ATP synthase B
chain 


496 


55 


1181 


Y1172C 


Homo sapiens


collagen type XIV 


1048 


97 


1182 


XB2240 


Homo 
sapiens) 
>R94974 
R94974 0S- 
MAY-1996 27~ 
OCT- 1994 j 
Human TCL-1 
polypeptide. ' 


T cell leukemia/ lymphoma 1 


617 


100 


(Homo 








1183




Caenorhabditis elegans


short region of weak
similarity to collagen 


16] 


3i 1 




AJ131613 


Homo sapiens 


dicarboxylate carrier protein


147C 


99




L2764S 


Danio rerio


Growth-associated protein


130 


36


1186


Y0273B 


Homo sapiens 


Human secreted protein
encoded by gene e9 clone
HLHFP03.


636


100 

I 


118b 


AF217S04 


Xenopus laevis


oriji inine uetdi oGAyi obc ^ 


14 


60 


lies 


AL136307


Homo sapiens


dJ38038.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


182 


33 ! 

1 
i 


1190


<i607 


Homo sapiens 


rTSbeta 


19", 


100 




U3 2S28 


Haemophilus
influenzae
Rd


ribosomal protein S6
modification protein (rimK)


26 6 


31


1192 


AF154631 


Rattus norvegicus


PV-1


14 02 


60 




Y S 0926 




Human fetal brain cDNA clone
vclSl derived protein 


91* 


100 1 


1194 


AF026530 


Rattus norvegicus


stathmin-like-protein splice
variant RB3 ' 1 


1093 


97 


1195


U35244


Rattus norvegicus


vacuolar protein sorting
homolog r-vps33a


29e: 


96 


1196


V7O470 


Homo sapiens 


Human p53 target molecule, 
FRG3 protein. 


1680 


100


1197


AF:5731B 


Homo sapiens 


AD-017 protein 


912 


47 


1198


Ar 1 /:>44 J 


Caenorhabditis elegans


contains similarity to S. 
pombe phosphatidyl synthase 
(GB : Z28295) 


460 




1199


AF201934 


Homo sapiens 


DC12


16 4 9 


ee 


1200 


AL031775 


Homo sapiens


dJ30M3.3 (novel protein
similar to C. elegans


1902 


100


1201


K2I103 


Ovis aries 


B13 1B4 high-sulfur keratin


1 484 


82 


1202


Z8 c -98£ 


Homo sapiens


suppressor protein SRP40)


1143 


"75 


1203


U3 07 G 2 


Rattus norvegicus




89C 


52 


1204


U35730 


Mus musculus


1 erkv 


223b 


76 


1205


ABOC2327 


Homo sapiens


KIAA0325


151 


24 


1206


> M V _1 -7 *1 _> -J 


Arabidopsis thaliana


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207 


AL136307


Homo sapiens


d0380B8.2 (Neuritin, l 
protein which promotes 
neurite outgrowth) 


742 


100 


1208


AF207989


Homo sapiens


orphan G-protein coupled
receptor 


2326 


100 


1209 


ZS7 63 0 


Homo sapiens 


dJ466N1.4 (novel protein,
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G)))


181 


44


1210 


U21S49 


Mus musculus


Ac39/physophilin


1280 




1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


100


1212 


AF117814


Mus musculus


odd-skipped related 1 protein 


94b 


1 66 


1213 


AF277233 


Naegleria fowleri


calcineurin E 


222 


I 39 


1214 


D14849 


Mus musculus


meiosis-specific nuclear
structural protein 1 


1950 


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216 


Z72510 


Caenorhabdit 


similarity to yeast UTR3 


634


49 


is elegans


protein (Swiss Prot accession 
yk67?hll.5 comes from this
gene






1217


Z49703 


Saccharomyces cerevisiae


unknown


13*


2/ ■ 


1218


ACO3 3 4 30 


Arabidopsis
thaliana 


F3F9.1E 


199 


29 


1219 


L10910 


Homo sapiens


splicing factor 


1026


71 


1220


Z707S0 


Caenorhabditis elegans


similar to vanadate 
resistance protein 
trans:m=mbranous comes zkov 
this gene 


965 


5H


1221


AL16381S


Arabidopsis thaliana


putative protein


653


61


1222


AF2S5100 


Homo sapiens 


zinc finger protein NY-REN-21
antigen 


226i 


100 


1223


J05071 


Bos taurus


GTP-binding regulatory
protein gamma-6 subunit


3S* 


100


1224


Y73364 


Homo sapiens 


HTRM clone 2765991 protein
sequence.


3165


99 


1225


ALOS0170 


Homo sapiens 


hypothetical protein 


714 


100


1226 


X64002 


Homo sapiens 


RAP74 


2661 


99 


1227


X04085 


Homo sapiens 


catalase


2846 


100 


1228


A0005620 


Mus musculus 


skeletal muscle-specific gene


14 16 


90 


1229


AF045564 


Rattus norvegicus


development-related protein


1715 


93 


1230


xy757l 


Mus musculus


HCMV- interacting protein 


479 


96 


1231


L08239


Homo sapiens


located at OATL1


2274


100


1232


AF121863


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AF121863 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


AC024805 


Caenorhabditis elegans


contains similarity to
TR:O0459S 


744 


31 


1235 


AC006634 


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae
probable membrane protein
YLR418C (GB:U20162)


357 


33 


1236 


Y18101 


Mus musculus


macrophage actin-associated-
tyrosine-phosphorylated
protein


1559 


87 


1237


AB042646 


Homo sapiens 


TSIF2 


1224 


100 


1238


AB026264 


Homo sapiens 


IMPACT


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240


G00429 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 4510.


324 


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21- 


1363 


53 


1242


AL035602 


Arabidopsis
thaliana 


putative protein 


499 




1243


X76483 


Gallus gallus


Yes-associated protein
(65kDa)


574 


4e 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100 


1245


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein)


856 


iob 


1246 


AJ276003 


Homo sapiens 


GAR1 protein 


1216 


100 


1247 


Y57910 


Homo sapiens 


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248


AC004874 


Homo sapiens 


similar to N-
acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


957 


100 


1249


AK1 99597 


Homo 
sapiens 


A-type potassium channel
modulatory protein 1 


1139 


100 


1250 


Y13248 


Rattus norvegicus


PAG60fc 


1350 


> 


1251 


M24 852 


Rattus 
norvegicus 


neuron-specific protein PEP-19


124 


46 


1252 


AF146738


Rattus


testis specific protein


771 


-i 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6806.


415 


" i 


1254 


VI4 4 375 


Homo sapiens


Human ubiquitin-conjugating
enzyme polypeptide


1045 


1 


1255


ACO06S38 


Homo sapiens 


BC41195_1 


831 


7t 


1256 


AB004316 


Bos taurus


mitochondrial methionyl-tRNA
transformylase


1556 


8f 


1257


Z35094 


Homo sapiens


SURF-2


1354 


s: 


1258


Y13 362 


Homo sapiens


Amino acid sequence of
protein PRO214.


2383 


100 


1259


AC006014 


Homo sapiens 


similar to RFP transforming
protein; similar to P14373
(PID:g132517)


1299 


100


1260


AC005095? 


Homo sapiens 


match to AI222572
(NID:g3804775)


469 


100 


1261


V00507 


Homo sapiens 


coding sequence of DHFR (1 is
1st base in codon) (561 is
3rd base in codon)


984 


100


1262 


X1S443 


Rattus sp.


gamma-glutamyl transpeptidase
(AA 1-568) 


697 


31 


1263 


AF173B71 


Mus musculus


neuronal PAS 3 


977 


94 


1264 


AF178983 


Homo sapiens


Ras-associated protein Rap1


433 


97 


1265


Y70473 


Homo sapiens 


Human cyclic nucleotide-
associated protein-1 (CNAP-
1).


2785 


95 


1266


Y4I738 


Homo sapiens


Human PRO541 protein
sequence.


1622 


100


1267 


AF061346 


Mus musculus


Edp1 protein


1077 


64 


1268 


U97006 


Caenorhabditis elegans


C13F10.4 gene product


154 


23 


1269


AF233582 


Mus musculus


GJPase KabJ / 


942 


91 


1270


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127 


1 


1271


AL031177 


Homo sapiens 


dJ8QSM15,3 (novel protein) 


1150 


5b I 


1272


AF201933 


Homo sapiens 


DC11 


650 


100


1273


AF201933 


Homo sapiens 


DC11 


346 


9k i 


1274


AL02171O 


Arabidopsis thaliana


putative protein


348 


4!> • 
i 


1275


AC004449 


Homo sapiens 


R33663_3 


556 


100


1276 


Y8629S 


Homo sapiens 


Human secreted protein 
HL2AGB7, SEQ ID NO: 210.


1920 


100


1277 


Y713U 


Homo sapiens 


Human Hydrolase protean- 
(HYDRL-9) . 


1576 


9S- 


1278 


S94421 


Homo sapiens


T cell receptor eta-exon 


478 


100


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100


1280 


AF161380 


Homo sapiens 


HSPC262 


772 


100


1281 


Y4861D 


Homo sapiens


Human breast tumour 
associated protein 71. 


779 


100 


1282 


AC015446 


Arabidopsis thaliana


Similar to AIG1 protein


406 


3i: 


1283 


AK024432 


Homo sapiens


FLJ00022 protein 


4 03 


3£ 


1284 


W96~1S3 


Homo sapiens


Human FADD-interacting
protein (FIP).


1825 


81 


1285 


AO001019 


Homo sapiens


ring finger protein 


1301 


100


1286 


AE0C3823 


Drosophila 
melanogaster 


CG13178 gene product 


19S 


29 


1287 


AF178632 


Homo sapiens


FEM-1-like death receptor
binding protein 


32*61 


100 


1288 


AC006033


Homo sapiens


similar to MLN 64; similar to
I38027 (PID:g2135214)


1195 


200 


1289 


AC006033 


Homo sapiens


similar to MLN 64; similar to
I38027 (PID:g2135214)


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54 7j 


1291


273424 


Caenorhabditis elegans


C44R9 . 1 


235 

) 


~3T~ j 


1292 


Y94871


Homo sapiens


Human protein clone HP025S^ . 


122i 


100


1293


AF19042S 


Homo sapiens


retinoblastoma-associated
protein RAP140 


489 


i 


1294 


G03356 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7937. 


538 




1295 


AF133670 


Mus musculus


ARb-6 interacting protein-? 


367 


51 


1296 


AJ249735 


Homo sapiens 


claudin-6 


1142 


100


1297 


X5756C 


Escherichia coli


pspE protein 


525 


100


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains
protein 1 


1997 


100


1299 


U41023 


Caenorhabditis elegans


coded for by C. elegans cDNA
yk61fl.3; coded for by C. 
ykl09h8.5 


324 


?.Sr 


1300 


AB02452? 


Homo sapiens


basic kruppel like factor


1206 


100 


1301


X55985 


Homo sapiens 


eosinophil cationic-related
protein 


737 


9S 


1302


AF007151 


Homo sapiens 


unknown 


1481 


100


1303


X52904 


Escherichia coli


open reading frame (AA 1-65)


359 


100


1304 


U19577 


Escherichia coli




242 


93 


1305 


AF266508 


Mus musculus


r«lLijr protein 


1409 


97 


1306


Y57901 


Homo sapiens 


HTMPN-25. 


932 


100


1307


U58750 


Caenorhabditis elegans


e a cv\> 1 *- c\ t h»"> mi I- c\r h^ncir* i fi 1 

carrier family 


365 


54 


1308


AF044774 


Homo sapiens 


Dl coAJJUlilL Li UolCI i cy juu 

protein 2 


2681 


99 


1309


AL078593 


nvmy 2: a JJ.L 


1 <K1AA06B0) 


267 


34 


1310


X62693 


Homo sapiens 


E48 antigen 


620 


9b 


1311 


282263 


Caenorhabditis elegans


C47A4.1


283 


35 


1312 


AF131218 




frame 5 


1493 


100 


1313 


Y4176 3 


Homo sapiens


Human PRO938 protein
sequence.


1636 


100


1314


AF1S6 9 72 


Homo sapiens 


JM24 protein 


2239 


100 


1315 


AF053356 


Homo sapiens 


insulin receptor substrate 
like protein 


228 


97 


1316 


Y66695 


Homo sapiens


Membrane-bound protein 
PRO1344.


1909 


100 


1317 


AF153127 


Gallus gallus


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus gallus


SAPK interacting protein 


1477 


83 


1319 


AF153127 


Gallus gallus


SAPK interacting protein


1651 


86 


1320 


X56932 


Homo sapiens 


23 kD highly basic protein 


1044 


100


1321 


AF17460S 


Home 
sapiens) 
>Y83Q86 
YB3086 09- 
MAR-2000 28- 
AUG-1998 F- 
box protein 
FBP-18. 
[Home 
sapienr 


F-box protein Fdx2S 


467 


7C 


1322 


M61732 


Trypanosoma 
cruzi 


neuraminidase


214


2<


1323 


Y17013 


porcine
endogenous
retrovirus


pol


304


64


1324


Al,] 386Sf.


Arabidopsis thaliana


putative protein


\\ik j y 


1325


ALl3 36bb


Arabidopsis thaliana


putative protein


946 


I 3:" 


1326


ALl3 3l>lS


Homo sapiens


bA108L7.2 (novel protein
similar to rat tricarboxylate
carrier)


132; 


CO 1 " " 




AF161543


Homo sapiens


HSPC05C 


135'/ 


6c 


1328


1/33 46


Homo sapiens


HTRM clone 6 196 9s* protein
sequence.


78b 


Si 


1329 


bl 092 C


Homo sapiens


splicing factor 




1330


AF14 6S68


Homo sapiens


M3L1 protein


193£


100


1331


K8 7 7 72


Homo sapiens


Human serum glucocorticoid-
regulated kinase (H-SGK2)
polypeptide.


232


3f


1332


Y4:741


Homo sapiens


Human PRO704 protein
sequence.


1860


100


1333


AF29S096


Homo sapiens


zinc-finger protein ZERK: 


411 


91 


1334


282273 


Caenorhabditis elegans


Similarity to Mouse kinesin-
like protein Kl 74 comes from 
this gene 


576 


4s 


1335


AE000810


Methanobacterium
thermoautotrophicum


conserved protein


290




1336


Y6877i-


Homo sapiens


Amino acid sequence of a
human phosphorylation
effector PHSP-21


1019 


c. - 


1337


AJ6027003


Mus musculus


protein phosphatase 


378 




1338


U64 85fc


Caenorhabditis elegans


weak similarity to TPK 
domains. 


215 


4C 


1339


AE001394


Plasmodium falciparum


protein of the YMR7 family 


170 


29 


1340 


X76737


Homo sapiens


MT-11 protein


204 




1341 


AC011914


Arabidopsis thaliana


putative mutT protein; 6839fc- 
67B81 


289 


4£- 


1342 


AJ276171


Homo sapiens


ASPIC 


2122 


100


1343 


AF1&7016


Homo sapiens


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC00C963


Homo sapiens


similar to Kelch proteins; 
similar to D7VA77027 
(PID:g4650844) 


894 


1 3L 


1345


AF257466


Homo sapiens


N-acetylneuraminic acid
phosphate synthase


1880 




1346 


Y25P96


Homo sapiens


Human secreted protein
fragment encoded from gene
64.


1146 


100 


1347 


AJ272073


Torpedo marmorata


male sterility protein 2-like 
protein 


1664 


5f 


1348


AF161548


Homo sapiens


HSPC062 


1018 


9t 


1349


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96.


1117 


100 


1351


G02144 


Homo sapiens 


Human secreted protein, SEQ 

tt\ \jr\ COOL. 


418 


100 


1352 


D90869 


Escherichia coli


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP-14


613 


100 


1354 


AC005328 


Homo sapiens 


R26660_3, partial CDS 


870 


74 


1355 


AC024876 


Caenorhabditis elegans


contains similarity to 
SW:RPB1 CRIGR 


829 


6; 


1356 


AF077226 


Homo sapiens 


copine III


1876 


64 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234 


3869 


100


1361 


AL163279 


Homo sapiens


homolog to cAMP response


5035 


39 


element binding and beta
transducin family proteins






1362


Z4847i 


Homo sapiens 


glucokinase regulator


3160 


95 


1363


Z4847i 


Homo sapiens


glucokinase regulator


2682 


57 


1364 


AF195764 


Homo sapiens 


megakaryocyte-enhanced gene 
transcript 1 protein; MEGT1
protein 


205? 


95 


1365 


AF'12 6 60 9 


Homo sapiens 


PRO0915


581 


100 


1366 


AF1 16605 


Homo sapiens 


PRO0915


58: 


100 


1367 


AL1173S2 


Homo sapiens 


~dj8veB10.3 (novel protein 
similar to C. elegans 
T19B10.C (Tr :Q22557) ) 


2581 


95 


1368


Y34124 


Homo
sapiens


Human potassium channel
K+Hnovlb


1342


100






1369 


AJ24S6? J 


Homo sapiens 


CTL2 protein 


3728 


95 


1370 


AF008220 


Bacillus subtilis


YtaG 


425 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA -
25 to 1018) (3416 is 2nd base
in codon)


590t 


99 


1372 


ZS804f 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain
protein) 


1296 


95 


1373


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus norvegicus


lamina associated polypeptide
1C 


1567 


69


1375


U53445 


Homo sapiens 


DOC1


1645 


46 


1376 


ALII 73 3 7 


Homo sapiens


bA293J16.1 (zinc finger
protein 33a (KOX 31))


250 


60 


1377


AC00532t 


Homo sapiens 


R26660_l, partial CDS 


112( 


100


1378


U3S11? 


Homo sapiens 


metastasis-associated gene


1821- 


65 


1379 


L3S313 


Caenorhabditis elegans


putative 


85b 


5b 


1380 


Y2 57S6 


Homo sapiens 


Human secreted protein 
encoded from gene 46.


150* 


100


1381 


AB037360


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN


955 


97 


1383 


AF237676


Mus musculus


G beta-like protein GBL


1721 


Si 


1384


AF237676


Mus musculus


G beta-like protein GBL 


104? 


70 


1385 


Y5879I- 


Homo sapiens 


Human calcium regulatory 
protein CaRKG-l. 


73E 


100 


1386 


AF222162 


Homo sapiens 


ninein


1036S 


95 


1387 


AL0316 8S 


Homo sapiens


dJ963X23.2 (novel protein) 


337 


33 


1388 


ACO0489O 


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380
>W06316 W06316 03-OCT-1996 
27-APR-1995 TRP-1 protein. 


S42 


86 


1389


AF1879B9 


Homo sapiens 


zinc finger protein ZNF223 


2665 


95 


1390 


ACO3 5150 


Homo sapiens


Zinc finger protein ZNF221


3455 


100


1391 


AF287894 


Homo sapiens 


PIS7 


1410 


97 


1392 


AF282265 


Homo sapiens 


inner centromere protein 
INCENP 


1794 


99 


1393 


X9C84C


Homo sapiens


axonal transporter of
synaptic vesicles


4584


99






1394 


AF076249 


Homo sapiens


zinc finger protein SBB1Z1 


3206 


95 


1395


G02224 


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 6305.


299 


75 


1396 


AC0C4809 


Arabidopsis 
thaliana 


Similar to


130 


34 


1398 


AF242519 


Homo sapiens


zinc finger protein SBZF3 


181 


6c 


1399


AL133396 


Homo sapiens


dJ1068H6.4 (prion protein 
like protein doppel) 


962 


100 


1400 


Y4861I 


Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


95 


1401 


AC004472 


Homo sapiens 


PI. 11659_ I 


280 


5* 


1402 


X91485 


Saccharomyces cerevisiae


putative HMG box


164 


27 


1403


Y 7 9 2 1' 1 


Homo sapiens


Human transferase T3NSPS-14.


2 84; 


1 00 


1404


Xtt 1 05* 


Mus musculus


t e X 2 b 1 


1 oi ( 


99 


1405


AB012084 


Mus musculus




194 


29 


1406


AB03021 i 


Homo sapiens 


GTPase activating protein


3 23 


99 


1407


AJ0105&5 


Rattus rattus


i l'?ii-)iKe protein 


2684 


95 


1408


X7576< 


Drosophila melanogaster


LRP4 . 

_ 


364 


29 


1409


U7661 f 


Mus musculus 




804 


48


1410


AC0OS57G 


Homo sapiens 


P2C887_l, partial CDS 


63i 


63 


1411


AEOO0284 


Escherichia coli


orf, hypothetical protein


36C 


100 


1412


X0156- 


Escherichia coli


L5 (rplE) (aa 1-179)


91: 


100 


1413


W7827: 


Homo sapiens


Fragment of human secreted
protein encoded by gene 33. 


1264 


99 


1414 


ABC32051 


Homo sapiens


organic anion transporter
OATP-t 


383; 


100 


1415


Ml 746f 


Homo sapiens 


coagulation factor XII


3456 


100 


1416


AFD97S94 


Homo sapiens


L-kynurenine/alpha-
aminoadipate aminotransferase


2201 


99 


1417


|~AK151077 


Homo sapiens 


HSPC24? 


126; 


99 


1418


Y0994L 


Rattus norvegicus


putative integral membrane 
transport protein 


109* 


61 


1419


U1315I 


Mesocricetus auratus


guanine nucleotide-binding
protein beta 5


2175- 


76 


1420


ALlC245f 


Homo sapiens 


fcM65L10.S (KIAA1176 (novel
protein, presumed ortholog
of mouse K-Cl cotransporter
KCC2))


S69< 


100 


1421


Y9542f 


Homo sapiens 


Human PRO1604 (UNO/785) amino 
acid sequence SEQ ID N0:30B. 


is: 


29 


1422


Y9422I- 


Homo sapiens


Human secreted protein clone 
qs34_3 protein sequence SEQ 

ID NO; SI . 


4039 


99 

1423


AF177386 


Homo sapiens


cancer-amplified
transcriptional coactivator 

asc- : 


1074t 


59 


1424 


Y48S1* 


Homo sapiens 


Human breast tumour-
associated protein 62. 


1851 


99 


1425


AF20684E 


Homo sapiens 


BM-001 


14S4 


89 


1426 


AF2C8846 


Homo sapiens 


BM-ooe 


853 


79 


1427


AF112886 


Bos taurus


differentiation enhancing
factor 1


4 6 9-* 


95 


1428


U41387


Homo sapiens 


Gu protein 


1372 


63 


1429 


AF16J534 


Homo sapiens 


HSPC04f^ 


2853 


78 


1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase


275 


30 


1431


Y6 671E« 


Homo sapiens


Membrane-bound protein
PKCl IOC . 


1886 


100 


1432 


AF193613 


Homo sapiens 


cell recognition molecule
Caspr/ 


568 


100 


1433


AB0G4560 


Mus musculus 


T— 

Gu a col it. 


192 


34 


1434 


R9990C 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 


707 


51 


1435


AF220S3 0 


Homo sapiens 


myo-inositol 1-phosphate
synthase A3


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing
factor


1261 


72 


1437 


A7271732 


Homo sapiens 


bridging integrator-3


1282


100 


1438


Y30BU 


Homo sapiens 


Human secreted protein 
encoded from gene 1.


S95 


98 


1439 


A0293659 


Homo sapiens 


mucolipidin


626 


97 


1440 


AF219128 


Homo sapiens 


GGA3 long isoform


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform 


3346 


100 



1442


AB035669 


Homo sapiens


ALtiX-


1944 


100


1443


AF237711 


Drosophila melanogaster


Iuabic


is: 


1 


1444 


AJOllfcSG 


Homo sapiens


Kef3 oeta protein


43S


3S I 


1445


X73874 


Homo sapiens


phosphorylase kinase


623?


96


1446


Ar 2 14 124 


Homo sapiens


breast carcinoma-associated
antigen FCAA


3991


95


1447


AF003924 


Homo sapiens


AN C_ 2 HO 3


2645 


9? \ 


1448


AF0G3136 


Caenorhabditis elegans


contains weak similarity to
an AMP-binding motif 


264L- 


^ 1 
I 


1449


AF1S5112 


Homo sapiens


NY-REN-50 antigen


1184 


85 


1450


Y9SO04


Homo sapiens


Human secreted protein 
vcB4 1, SEQ ID N0:4fc. 


S8b 


100


1451


AF107203


Homo sapiens 


ataxin 2-binding protein 


68fc 




1452 


AF107203 


Homo sapiens


ataxin 2-binding protein


4S( 


78 


1453


Z38013


Mus musculus


LMK-N5 


88: 


5€ 


1454


X90566


Homo sapiens


Protein sequence and
annotation available soon via
LABEIT@EMBL-Heidelberg.DE


51C 


26 


1455


AL035409 


Homo sapiens 


dOS64M11.3 (similar to
sialyltransferase)


135* 


100 


1456


D4448C


Mus musculus


ttATH-2 protein 


272 


300 


145F 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1


4 76 


45 


1459


AF242S52


Gallus
gallus


X tr V- 1 ll\J v £ Jl 


945: 


34 


1460


U11036 


ri v j ■ i j a y j -* irijo 


3bd3 


724 


8<; 


1461


AB02B258


Mus musculus


granuphilin-a


54b 


35 


1462


Y08134 




acid sphingomyelinase-like
phosphodiesterase


2426 


1 95 


1463


AC004997 


Homo sapiens 


match to ESTs Z43S75 
(NID:gS73097 } , R19655V 
(NID:g774333 ) 


869 


98 


1464 


AC004997


Homo sapiens 


match to ESTs £43975 
<NID:gS73097) , R19691; 
(NID:g774333) 


865 


98 


1465


U32743 


Haemophilus 

influenzae 

Rd 


fucose operon protein (fucU)


31b 


50 


1466


Y09G22


Homo sapiens


Not56-like protein


2341 


100 


1467


RC003034


Homo sapiens


Homolog of rat kidney-specific
(KS) gene


107i 


95 


1468


AF071S44


Spinacia
oleracea


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


2e 

* 


1469


Y5793G


Homo sapiens 


Human transmembrane protein 
HTMPN-S4. 


1053 


100 


1470


AF0326G6


Rattus 
norvegicus 


rsec5 


4504 


93 

f 


1471


Y70467


Homo sapiens


Human membrane channel
protein-17 (MECHP-17).


452 


74 


147; 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1472 


AF177292 


Homo sapiens 


genethonin 3 


4026 


98 


1474 


S45936 


Homo sapiens 


HTS1 


1103 


50 


147b 


Y86241 


Homo sapiens


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


98 


1476 


AJO 10317 


Fugu
rubripes


Sand 


1276 


68 


1477 


U42831 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA
yk99b4.3; similar to human
transforming protein
(PIR:S22157)


84C 


44 


1478


X62447 


Homo sapiens 


PR 264 


543 


61 


1479


X8220S 


Homo sapiens 


mk: 


7116 


100 


1480 


U10S36 


Pan paniscus 


roHC ciace l a 


675 


84


1481


AL078599


Homo sapiens


dJ9$lC6.l (novel protein
similar to C. elegans
FS5A12.9 (Tr:P91066))


1274" 1 6: ' , 


1482


Z56S77


Schizosaccha
romyces
pombe


putative vacuolar protein


256


21


1483 


AB005662 


Mus musculus


JNK/SAPK-associated protein-1


4968 | 9i t 


1484


AL05C12C


Homo sapiens


hypothetical protein


716


100


1485 


K2787G 


Homo sapiens


DNA binding protein


1006 




1486


Y6916:


Homo sapiens


Amino acid sequence of a
partial protein kinase 


"575 


9S 


1487


X841S6


Saccharomyce
s cerevisiae


ATH1


341 


2S 

I 


1488


AK03 6963


Homo sapiens




446


34


1489


U56966


Caenorhabdit
is elegans


rr>rioi^ f at W\ f r~* pi trifle fDNIi 

yr<3 0fc3.5; coded for by C. 

CT X C?U cM | J> LJV*n y Iv J . .. 


620 


t 42 


1490


AE000989


Archaeoglobu
s fulgidus


Anrwtl _ X k<|J v n t- r\ ti tr ( fart - 4 \ 

cnoyi *loa nyur a t cibe iidu-sj 


533 


46 


1491


nouv j o 


HUi, VCy J C U o 


adenylyl cyclase type IV


707 


| 3t 


1492


Y73342


Homo sapiens


riiXM cjonc / / u^usb protein 
sequence . 


3513 


99 


1493


Y17220


Homo sapiens


Human secreted protein (clone 
f j2B3-ll) . 


4G2 


37 


1494


AVC 1. Jy t \J 


Mus musculus


ARL-6 interacting protein-2


701 


P97 t 


1495


Y94897


Homo sapiens


Human protein clone HP10574.


1371 


1 Of. 


1496


AL049699


Homo sapiens


dJ747H23.2 (novel protein)


1550 


100 


1497


/\rUJ f H *m f 


Homo sapiens


ribosomal S6 protein kinase


2427


100


1498


AL445067


Thermoplasma
acidophilum


putative target YPH207w of
the HAP2 transcriptional
complex related protein


2 69 




1499


AB039947


Homo sapiens


X 1 1 L- fci noi ng protein 53 


227 


L 3fc 1 

100 


1500 


AJ277750 


Homo sapiens 


UBASH3A protein 


3 509 


1501


AU050333


Homo sapiens


Qoy3i\44 * ± \ no ve j pxoi.ciij 
(contains DKFZP564B11 6) ) 


2439 


100 




AF17 96 96 


Homo sapiens




1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2s 


1177 


100 


1504


Y53005


Homo sapiens


Human secreted protein clone 
pn74 9_a protein sequence SEQ 

lt> NO -16 


1442 


99 


1505 


X82494 


Homo sapiens


fibulin-2


3580 


99 


1506 


X98296 


Homo sapiens 


ubiquitin hydrolase


783 


42 


1507 


AL034548 


Homo sapiens 


dJ1103G7.6 (novel protein)


1099 


100 


1508


Y76144 


Homo sapiens 


Human secreted protein
encoded by gene 21.


1736 


100 


1509


AF2201B2 


Homo sapiens 


uncharacterized hypothalamus
protein HT008 


1181 


98 


1510 


U64601 


Caenorhabdit
is elegans


Gene probably begins in the 
next cosmid 


415 


56 


1511 


AL356192 


Neurospora
crassa


related to MDM1 protein


196 


2b 


1512 


D17629 


Homo
sapiens


N-acetylgalactosamine 6-sulfate
sulfatase (GALNS)


1829 


100 


1513


AF168717


Homo sapiens


x 009 protein 


694 


99 


1514 


AJ243531 


Homo sapiens 


nM15 protein 


735


100 


1515 


AC003672 


Arabidopsis
thaliana


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 


AF115435 


Rattus 
norvegicus 


syntaxin 17 


1374 


90 


1517


AF003340


Caenorhabdit
is elegans


C44E4.5 gene product


274 


31 


1518 


AB002584 


Rattus 
norvegicus


beta-alanine-pyruvate
aminotransferase 


2238 


62 


1519


AL121764


Schizosaccha
romyces
pombe


yeast atp12 protein precursor
homolog


270


30






1520


Zi V 0 c. ^ Q 1 p 
r\f 4 Z> J -/ -1. \j 


Homo
sapiens


Vascular endothelial
junction-associated molecule


54 7 


■-r-^ 


1521


D31764


Homo sapiens


VT Ivli ft ft A L 


170 




1522


Y66634


Homo sapiens


ncnuji due ucuno ^(ulciii 
PRO190 . 


985 


j (*»(. 


1523


Y94450


Homo sapiens


Human inflammation associated

nyA,^ pin 
piOC til) 


2 5C 




1524




Arabidopsis
thaliana


F17F8.22


277 


3 '. 


1525 


AF109377 


Mus musculus


ldlBp 


1277 


e:- | 


1526 




Homo sapiens


dJ167A19.4 (novel protein) 


i /no 

1 4 S> Z 


? 


1527


Y08135


Mus musculus


acid sphingomyelinase-like
phosphodiesterase


1496 


7. 1 


1528


AK024423 


Homo sapiens 


FLJ00012 protein


611


100


1529


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase


679


100


1530 


AF205596 


Homo sapiens 


transposase-like protein


1368


100


1531 


AF2S1039 


Homo sapiens 


putative zinc finger protein


1420 


5< 


1532 


W74 805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOE AS 2 4 . 


4 93 


5*/ 


1533


AFC39023 


Homo sapiens 


Ran-GTP binding protein; 
RanBP€ 


5707 


95 


1534 


AC007190 


Arabidopsis 
thaliana


F23N19.S 


374 




1535 


AE027564 


Homo sapiens 


DINB1 


4482 


100


1536 


Y3G178 


Homo sapiens 


Human secreted protein


377 


ei 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone
vb3_l derived protein. 


3593 




1538


AF017368


Mus musculus


faciogenital dysplasia
protein 2 


177 


4'/ 


1539 


AF266756 


Homo sapiens


sphingosine kinase


2011 


9!r 


1540


240604 


Homo sapiens 


OA1 


2238 


100


1541 


AF000195 


Caenorhabdit
is elegans


Contains similarity to Pfam
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


379 


4> 


1542 


Y7115S 


Homo sapiens 


Human phosphodiesterase
interacting protein, 
myomegalin. 


9415 


9r 


1543 


X76092 


Homo sapiens 


DNA binding protein kfjij 


3 327 




1544 


AB015330 


Homo sapiens 


HRIHFB2 007 


631 


50 


1545


Ar 1 y oh o J 


Homo sapiens 


transcription factor LDP-lb


2 822 


1 Ul 


1546 


AF016417 


Caenorhabdit
is elegans


Similar to bZIP transcription
factor 


518 


42 


1547 


X55885 


Homo sapiens 


KDEL receptor


1 *\ {\C. 
a 1 Ub 


1 uv 


1548


kQAl XL A 


auratuf 


ubiquitin-activating enzyme
E1


a i c 
o J b 


4 


1549


/MjUZI / U / 


Homo sapiens 


QujUoilO.4 CKJl/vHUbDO / 


j boa 


10' 


1550


AJ223978


Bacillus
subtilis


YvqK protein 


292 


4? 


1551


AF145615


Drosophila
melanogaster


BcDNA.GH03377


822 


44 


1552


AL157734


Schizosaccha
romyces
pombe


putative mannosyltransferase
involved in N-glycosylation


4 35 


37 


1553 


AFQ79527 


Mus musculus 


IER5 


691 


63 


1554 


AB026291 


Rattus
norvegicus


acetoacetyl-CoA synthetase


1099 


ee 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3. 


1780 


9Sr 


1556 


AF116553 


Drosophila
melanogaster


antennal-specific short-chain
dehydrogenase/reductase


277 


32 


1557 


Y71056 


Homo sapiens 


Human membrane transport
protein, MTSP-1.


1975


99






1558


Y71056


Homo sapiens


Human membrane transport
protein, MTSP-1


1S75 


c c 


1559


Y71056


Homo sapiens


Human membrane transport
protein, MTSP-1.


1894


c *7 


1560


AF092050


Mus musculus


t =k _ 1 "} - TO 
K/H L d ~ i. , _J i ' - 


262 


44 


1561 


AL109627 


Homo sapiens 


dJ309X20.2 (acrosomal protein 
ACR55 (similar to rat sperm 
antigen 4 (SPAG4)))


1607 


S? 


1562


AJi3ieyo 


Homo sapiens 


DNA polymerase lambda


3002


100


1563


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015 


100


1564 


AC002400 


Homo sapiens 


Gene product with similarity 
to ubiquitin binding enzyme


2790 


100 


1565


AC00S306 


Homo sapiens 




919 


ei 


1566 


AF000195


Caenorhabdit
is elegans


Contains similarity to Pfam
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567 


AB033281 


Homo
sapiens


F-box and WD-repeats protein
beta-TRCP2 isoform C 


2879 


100 


1568 


D49473 


Mus musculus


truncated form of Sox17


1047 


7* 


1569 


AK025270 


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu 


4797 


9S 


1571


AK145713 


Homo sapiens 


SCHIP-1 


2388 


100 


1572 


AE003831 


Drosophila
melanogaster


CG164 4S gene product. 


180 


31 


1573


AF074603


Streptomyces
griseus
subsp.
griseus


NoriF


205


38


1574


U28993


Caenorhabdit
is elegans


F22D3.3 gene product 


144 


27 


1575 


AF129507


Homo sapiens


transcription factor ICBP90 


2 87 


66 


1576 


X64 678 


Homo sapiens


oxytocin receptor 


2002 


100 


1577 


AF237711


Drosophila 
melanogaster 


Diablo 


4 21 


54 


1578 




Homo sapiens


Human secreted protein, SEQ
ID NO: 5056.


A ft ft 




1579


i\t c. H O t H *t 


Cryptosporid


thrombospondin-related
adhesive protein 


123 


33 


1580




Homo sapiens


aJ585U4.2 (novel protein 
l r z&ns xa ion or cur*/* 

Z* 111 » * * v 17/ J 


663 


10C 


1581


AF041853 


Homo sapiens 


kinesin family member protein
KIF3A


345 


33 


1582 


AF025443 


Homo sapiens 


Opa-interacting protein OIP5


1198 


100 


1583 


AE001603 


Thermotoga 
maritima


glycerate kinase, putative


349 


34 


1584 


AF2S2282 


Homo sapiens 


Kelch-like 1 protein


3973 


100 


1585 


AF169675


Homo
sapiens


leucine-rich repeat
transmembrane protein FLRT1 


34 94 


99 


1586 


AF118274 


Homo sapiens 


DNb-5 


2628 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme


3167 


99 


1588


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98 4. 


181 


47 


1591 


Z25535 


Homo sapiens 


nuclear pore complex protein 
hnupl53 


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccha
romyces
pombe


hypothetical protein


235


54




i 


1595


W7832<.


Homo sapiens


Fragment of human secreted


131^ 


SI 








protein encoded by gene 81.






1596


Y9490C


Homo sapiens


Human secreted protein clone


2236


9b 








rb6 4 9_3 protein sequence SEQ 












ID NO: 1 8 






1597 


AF174605


Homo sapiens


F-box protein Fbx25


1408 


99 


1598 


AB032254 


Homo


bromodomain adjacent to zinc


3676 


98 






sapiens 


finger domain 2A






1599 


X73114 


Homo sapiens 


Slow MyBP-C


5568 


95 


1600 


X82200 


Homo sapiens


gpStafSC


2305


100


1601 


Y00876 


Homo


Human LAPH-1 protein 


1149 


98 






sapiens 


sequence . 






1602 


AJ223351 


Homo sapiens 


HIRA-interacting protein 3


2B21 


99 


1603 


AJ222801


Homo sapiens


neutral sphingomyelinase


2268 


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


1601 


S9 


1605


AF185576


Mus musculus


POZ/zinc finger transcription


3435 


97 








factor OCA- 8 






1606 


AF093744 


Homo sapiens 


unknown 


1 3D 


100 


1607 


A12142 


synthetic


IFN-pseudo-omega 1


80C 


Si 






construct 








1608 


Y57949 


Homo sapiens 


Human transmembrane protein 


186E 


100 








H7MPN-73 . 






1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97


1610 


X15218 


Homo sapiens 


ski protein (AA 1-728)


3765


100


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl


2976 


100 








transferase 






1612 


AF22056C 


Homo sapiens 


B/K protein


2486 


9S 


1613 


AC004481 


Arabidopsis 


nodulin-like protein 


3 71 


26 






thaliana 








1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase


1607


100


1615


Y1552: 


Homo sapiens 


start position ; 


3150 


97 


1616 


AJ010750 


Rattus


Castration induced prostatic 


890 


6i 






norvegicus 


apoptosis related protein-1,












(CIPAR-1)






1617 


XS8079 


Homo sapiens 


S100 alpha protein 


481 


100 


1618 


Y66676 


Homo


Membrane-bound protein


9G7 


100 






sapiens 


PRO1009. 






1619


AJ242973


Homo sapiens


peptide methionine sulfoxide


529


100








reductase 






1620 


AF1S0733 


Homo sapiens 


AO-014 protein 


288 


100 


1621 


AJ00750S 


Homo sapiens 


E1B-55kDa-associated protein


4646


96


1622


X64177


Homo sapiens


metallothionein


380 


100 


1623 


AE001045 


Archaeoglobu 


A. fulgidus predicted coding 


240 


36 






s fulgidus 


region AF0859 






1624 


AL355013


Schizosaccha 


mitochondrial carrier protein 


4 03 


34 






romyces 












pombe 








1625 


Y66746 


Homo 


Membrane-bound protein


1184 


100 






sapiens 


PRO1198.






1626 


D900S3 


Sus scrofa 


destrin 


863 


100


1627 


Y359S4 


Homo sapiens 


Extended human secreted


756


100








protein sequence, SEQ ID NO. 












203 . 






1628


AL031775


Homo sapiens


dJ30M3.2 (novel protein)


470


100


1629 


AF132484 


Mus musculus 


unknown 


266 


68 


1630 


AF017096 


Drosophila 


similar to C. elegans


4 93 


61 






melanogaster 


R10H10.6 and S. cerevisiae 












YD84l9-03c 






1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


763 


100 


1633 


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187 


Arabidopsis


Contains weak similarity to


143 


36 






thaliana 


GATA-6 DNA- binding protein 












gb|H36135, gb|Z26200 come 












from this gene. 









1635


AKU2624C-


Homo sapiens


HERV-E integrase


41j


90


1636




Homo sapiens 


Human adult brain cDNA clone
ve8_1 derived protein.


il2i 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase 


206t 


99 


1638 


AJ236247 


Mus musculus


putative phosphatase subunit


194* 


96 


1639 


Y94S4: 


Homo sapiens 


Human secreted protein clone 
yk25l_l protein sequence SEO 

id noTdc 


132C- 


100 


1640 


AF23S030 


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233286 


Drosophila
melanogaster 


WDS 


356 


26 


1642 


Ml 92 5: 


Mus musculus


immunoglobulin heavy chain
binding protein


145 


34 


1643 


Y704^'. 


Homo sapiens 


Human membrane channel
protein-2 (MECHP-2).


135. 


100 


1644 


AF176520


Mus musculus


WD repeat-containing F-box
protein FBW5


267fc 


86 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42 . 


115t 


100 


1646 


X67li^ 


Homo sapiens 


mitotic kinesin-like protein-1


445< 


99 


1647 


M6 318C 


Homo sapiens 


threonyl-tRNA synthetase


104C 


61 




Y87341 


Homo sapiens 


Human signal peptide
containing protein HSPP-119 
SEQ ID NO: 119. 


156t 


93 


1649


R9533;


Homo sapiens


Tumor necrosis factor
receptor 1 death domain 
ligand (clone 3TW) . 


413 *< 


100 


1650 


AC007136 


Homo sapiens 


Putative map kinase 
interacting kinase 


056 


99 


1651 


AB015346 


Homo sapiens 


EpslSfc 


4464 


99 


1652 


AL163S/6 


Arabidopsis 
thaliana 


putative protein


1341 


48 


1653 


AC00531? 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031426 


Homo sapiens 


dJl8 4J9.1 (KIAA0601 protein)


3526 


100 


1655 


AL031426 


Homo sapiens 


dJ164J9.1 (KIAA0601 protein) 


352fc 


100 


1656 


AB017910 


Dictyosteliu
m discoideum


mycM 


297 


32 


1657


Y28S1P


Homo
sapiens 


Human regulatory protein 
HRGP-S. 


2253 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 


AL07B62V 


Schizosaccha 

romyces

pombe


actin-like protein; 12 actin
domains) 


320 


34 


1662 


X52022 


Homo sapiens


collagen type VI, alpha 3 
chain 


16270 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1813 


100 


1664 


AF21473t 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyce
s cerevisiae 


unknown 


138 


26 


1666 


AF177385 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191_1 


1581 


47 


1668


S67513 


Borna 
disease 
virus BDV, 
WT-1, Halie 
Bl/91, horse 
brain, field 
isolate. 
Peptide, 370 


P4 0 


3 97 


43 











1669


??97S.-


Schizosaccha
romyces
pombe


putative NOL1-NOP2-sun family
nucleolar protein


56<- 


47 


1670


C0333C


Homo sapiens


Human secreted protein, SEQ
ID NO: 7211.


42? 


97 


1671 


t^662E 


gallus


cardiac muscle tensin


1185: 


S4 


1672


API 7448? 


Homo sapiens 


poJycomb ': 


20C: 


99 


1673 


Vr 184 6 


Homo sapiens 


Human 18.1 homolog protein
fragment.


233 


29 


1674 


AF2 5S334 


Homo sapiens 


EXP3f: 


15; 


2S j 


1675 


YS-4 36 7 


Homo
sapiens 


Human protein clone HP10563 . 


lot 


30 


1676


Y25712


Homo sapiens


Human secreted protein 
encoded from gene 2. 


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


158t 


91 


1678 


AF1631f>i 


Homo sapiens 


dentin sialophosphoprotein
precursor


170 


17 


1679 


AF3 63151 


Homo sapiens 


dentin sialophosphoprotein
precursor 


170 


17 


1680 


AKC244b3 


Homo sapiens 


FLO00D45 protein 


134S 


100 


1681


AF019236


Dictyosteliu
m discoideum


TipC 


61? 


34 


1682


A." 2 4 345?


Leishmania
major


proteophosphoglycan


15? 


26 


1683 


ZC9365 


Schizosaccha

romyces

pombe


putative GTP-binding protein


"56 0 ■ 


46 


1684


>^4910


Homo sapiens


ERp2 1 


1334 


100 


1685 


AF286475 


Takifugu
rubripes


retinitis pigmentosa GTPase
regulator-like protein


19f 


19 


1686 


AF191298


Homo sapiens 


vacuolar sorting protein 35 


4 08'/ 


100 


1687 


Aa 2 7b 986 


Homo sapiens


transcription factor 


295t 


100 


1688


AC^75986


Homo sapiens


transcription factor 


188d 


ae 


1689 


X07311 


Drosophila
melanogaster


heat shock protein 


138 


43 


1690 


A? 24 0463 


Rattus
norvegicus


LIS1-interacting protein
NUDE: 


1383 


83 


1691


AC 2 7 2 074?


Homo sapiens


APOBEC-1 stimulating protein


1256 


68 . 


1692 


A02 72O7 9 


Homo sapiens


APOBEC-1 stimulating protein


1336 


60 


1693 


API 77 942 


Xenopus 
laevis 


katanin p60


1664 


66 


1694


AP2 63539 


Homo sapiens 


arginine N-methyltransferase


1774 


100 


1695


AF^226B9


Homo
sapiens


protein arginine N-methyltransferase
1-variant 2


1182 


8i ; 

i 


1696 


AK0O0193 


Homo sapiens 


unnamed protein product


106C


100


1697


AP041035 


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


3122 


100 


1698


AE041035


Homo sapiens


kidney superoxide-producing
NADPH oxidase


2161


100


1699


AF025772 


Homo sapiens


C2H2 zinc finger protein 


488 


54 


1700 


Y44676 


Homo sapiens


Human ARF-Related Protein-1 
(HARP-1) . 


938 


97 


1701 


AXC22407 


Homo sapiens


unnamed protein product 


315 


98 


1702 


AEC24574 


Homo sapiens 


GTP-binding like protein 2 


1172 


100 


1703 


AFC55078 


Homo sapiens


zinc finger protein 42


421 


52 


1704 


AF1S8092 


Mus musculus


KF42 


1057 


77 1 


1705 


AE003573 


Drosophila
melanogaster


CG12474 gene product


161 


33 


1706


ACC36345


Drosophila
melanogaster


aquaporin


164 


24 


1707 


Y5S927 


Homo sapiens 


Human STLK2 protein.


2146


100


1708


U27121 


Danio rerio 


Gi2 


212 


47


1709


AL391710


Arabidopsis
thaliana


putative protein


505


rso~








1710


B0131:


Homo sapiens


Human PRO241 polypeptide.


164! 


97 i 


1711


U4075C


Mus musculus


formin binding protein 30


r4 5'6j 


1 


1712 


AJOlillfc 


Mus musculus 


skeletal muscle and cardiac
protein 


149t 


89 , 

1 


1713 


AF2S5303 


Homo
sapiens


membrane-associated nucleic
acid binding protein 


44U 


9S j 
1 


1714 


AF255303


Homo
sapiens


membrane-associated nucleic
acid binding protein 


2.96 C 


100 j 

1 


1715 


UC8227 


Rattus
norvegicus


Ras-related protein


511 


51 ! 


1716 


AF168795 


Rattus
norvegicus


schlafen-4


•125- 


44 


1717 


AF196304 


Homo sapiens 


SUMO-1-specific protease


5804 


99 


1718


AL35S737


Homo sapiens


HMG20A


1782


100


1719


ABC29333


Halocynthia
roretzi


HrPET-1 


1 1065 
1 


46 


1720 


AF071317 


Mus musculus 


COP9 complex subunit 7b


fl297 


97 


1721 


AJ27221S 


Homo sapiens 


HEYL protein 


1681 


99 


1722 


GDI 96; 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6063.


718 


IOC 


1723 


AL032643 


Caenorhabdit 
is elegans 


similar to Uncharacterized 
protein family UPF0034


825 


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6053. 


586 


92 l 

i 


1725 


Y94C4J 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100 ] 


1726 


Al'255443 


Homo sapiens 


CGI-201 protein


4 3 97 


99 


1727 


AF1U3426 


Homo sapiens 


HT004 protein 


1810 


99 ] 


1728 


D3G884 


Bos taurus


neurocalcin


1002 


99 | 


1729 


Z-HS29 


Gallus
gallus


tensin


1411 


84 


1730 


Z73423 


Caenorhabdit
is elegans


cDNA EST EMBL:ZI4 908 comes
from this gene-cDNA EST this
gene 


233 


41 


1732


AF0i0891 


Homo sapiens 


PRO03 05 


470 


30 


1733 


AJ277724 


Homo sapiens 


histone deacetylase 8


2015 


100 


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131 . 


503 


95 


1735 


D46513 


Mus musculus 


leucine-rich-repeat protein


3S31


94 


1736 


AF096709 


Drosophila
virilis


failed axon connections
protein


276


32 


1737 


AF195320 


Homo sapiens 


dynactin p62 subunit


2417


99 


1738 


L35314 


Caenorhabdit
is elegans


contains similarity to Pfam
family PF01772 N=1


206


37 


1739 


XS4618 


Listeria
monocytogene
s


phosphatidylinositol specific
phospholipase C


134


27 


1740 


AL031658 


Homo sapiens 


dJ3iOOl3.4 (novel protein 
similar to predicted C. 
elegans and C. intestinalis
proteins) 


123 


31 


1741 


Y36924 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO.
173 . 


1013 


99 


1742 


AC0133 54 


Arabidopsis
thaliana


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein
APD08 . 


1854 


61 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPS1A


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino
acid sequence SEQ ID NO: 116. 


1332 


99 


1747 


Y94294 


Homo sapiens 


Human coenzyme A-utilising
enzyme CoAEN-2.


842


100






1748


AK024436


Homo sapiens


FLJ00026 protein


161? 


300 t 


1749


AE000877


Methanobacte
rium
thermoautotr
ophicum


conserved protein


231 


36 . 

i 


1750


AF101361


Drosophila
melanogaster


Abnormal X segregation 


193 


i 


1751 


V15067 


Homo sapiens 


ZNF232


889 


io(, ; 


1752 


AF251038


Homo sapiens


GAP-like protein


822


100


1753 


AC003093 


Homo sapiens 


oxysterol-binding PROTEIN;
45% similarity to P22059
(PID:g129308)


352 


57 

1 


1754 


X630B9 


Homo sapiens


165kD protein 


5703 




1755 


AL0497S5 


Homo sapiens 


dJ622L5.3 (novel protein)


1039


100


1756 


AL0313S3 


Homo sapiens 


dJ733D15.1 (Zinc-finger
protein)


2765


100


1757 


AB040672 


Homo sapiens


UDP-GalNAc: polypeptide N-
acetylgalactosaminyltransferase


2020 


95 


1758 


AL022236 


Homo sapiens 


dJlC42K10.4 (novel protein)


776 


43 


1759


AF117653 


Homo sapiens 


double homeobox protein 


375 


54 


1760


Y12065


Homo sapiens


hNop56


2959 


99 


1761 


AL049712 


Homo sapiens 


dJ686C3.2 (nucleolar protein
hNop56) 


2595 


9S= 


1762




Homo 
sapiens 


Gene product with similarity
to dynein beta subunit


1542 


51 


1763




Homo sapiens


formiminotransferase
cyclodeaminase


8 77 


10C i 

j 
1 


1764 


U91541


Homo sapiens


human formiminotransferase-
cyclodeaminase (ftcd) protein,
carboxy-terminal end


596 


100


1765 


AB013365 


Bacillus
halodurans 


YlgF 


350 


34 t 


1766 


V38421 


Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145 


71 


1767 


AC009176 


Arabidopsis
thaliana


putative ribulose-1,5-
bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


216 


27 


1768 


AK000647 


Homo sapiens 


unnamed protein product 


73 7 


99 


1769 


AJ238982 


Homo sapiens 


VNN3 protein 


2665 


99 


1770 


U73522 


Homo sapiens 


AMSH 


1214 


56 


1771 


U8943 5 


Mus musculus


unknown 


829 


86 


1772 


S70011 


Rattus sp. 


tricarboxylate carrier


1604 


95 


1773 


AL03SO86 


Homo sapiens 


dJ44A20.2 (novel protein)


2036 


100 


1774 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino
acid sequence SEQ ID NO:308.


1057 


99 


1775 


AF110330 


Homo sapiens 


glutaminase 


3146 


100 


1776 


AJ269529


Homo sapiens


glycerol 3-phosphate permease


2787


100


1777 


Z81579 


Caenorhabdit 
is elegans 


cDNA EST yk7 6£l.b comes from
this gene 


232 


3D 


1778 


AY007239 


Homo sapiens 


monooxygenase X 


1875 


99 


1779 


AL109608 


Schizosaccha 

romyces

pombe


oxysterol-binding protein
family 


644 


38 


1780 


AF254260 


Homo sapiens


tuftelin 1


1729 


100 


1781 


L07924 


Mus musculus


guanine nucleotide 
dissociation stimulator 


247 


50 


1782 


AF295773 


Homo 
sapiens 


ral guanine nucleotide 
dissociation stimulator 


142 


45. 


1783 


AK024475 


Homo sapiens 


FLJ00068 protein


4333


100


1784


AK024475


Homo sapiens


FLJ00068 protein


3996 


93 


1785 


G03933 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8014. 


570 


100 


1786 


"S82637 


Homo sapiens 


Ig lambda-like gene/beta-glucuronidase
exon 11 homolog


247


100



TABLE



SEQ ID NO: 


ACCESSION
NO.


DESCRIPTION


RESULTS*


2 


BL00240


Receptor tyrosine kinase
class III proteins.


BL00240B 24.70 8.250e-12 3S7-181




PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109D 17.04 8.065e-
13 358-381

i 


L 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 9.400e-
10 1129-1146 BL00028
16.07 1.257e-09 820-
837


C 


BL00023


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


I 


BL00023


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390




BL00023


Type II fibronectin
collagen-binding domain
proteins.


33 413-450 BL00023 
390 




BL00023


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-

24.31 4.545e-27 353-
390


c 


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.115e-
09 863-917


10 


PR00464


E-CLASS P450 GROUP II
SIGNATURE


PR00464D 17.40 6.182e-

12.41 4.231e-11 377-
393


11


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR007341 11.46 4.296e- 
09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e-
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031


IMMUNOGLOBULIN V REGION.


DM00031B 15.41 3.848e-
09 79-113




PR00208


L>IjiAJJlN /ilS L» L»rJ* t=-JL»U A C.M J N 

SUPERFAMILY SIGNATURE


30 517-535 PR00208A
12.59 2.233e-09 520-
538


17 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 6.200e-
34 282-295 PD00066
13.92 9.400e-14 477-
490 PD00066 13.92
6.500e-13 505-518
PD00066 13.92 9.500e-
13 254-267 PD00066
13.92 1.429e-12 393-
406 PD00066 13.92
6.571e-12 421-434


le 


BL00845


CAP-Gly domain proteins.


BL00845 16.43 2.200e-
25 55-80


20 


BL00487


IMP dehydrogenase / GMP
reductase proteins.


BL00487E 16.12 5.737e-
26 154-199 BL00487F
18.79 8.984e-22 235-
276 BL00487G 26.82
4.082e-12 287-329


21


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487B 16.12 5.737e-
26 154-199 BL00487F
18.79 8.984e-22 235-
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 3.250e-
26 302-333


23


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 2.250e-
26 302-333




BL00115


Eukaryotic RNA
polymerase II
heptapeptide repeat
proteins


BL00115T 8.45 7.273e-
29 1208-1242 BL00115Q
18.08 2.776e-21 953-
983 BL00115V 11.86
8.000e-17 1604-1650
BL00115M 19.19 8.130e-
16 731-774 BL00115H
14.34 9.392e-16 463-
496 BL00115A 15.4<i 
7.414e-l5 43-6: 
BL00115R 6.50 6.128e-
14 983-1010 BL00115J
16.71 5.289e-14 591-
617 BL00115I £.33
4.336e-13 535-591
BL00115L 12.25 5.939e-
13 662-694 BL00115G
11.65 6.0:ie-33 435- 
463 BL0011SX 15.03 
3.417e-10 6l7-€5^ 
BL00115O 16.76 5.805e-
10 863-913 BL00115P
11.54 7.538e-10 913-
953 BL00115S 36.24
7.968e-10 1010-1052
BL00115U 10.34 4.475e-
09 1242-1265


26 


BL00420


Speract receptor repeat
proteins domain
proteins.


BL00420A 20.42 4.109e- 
11 81-110 BL00420A 
20.42 8.820e-10 64-113 


27


BL00050


Ribosomal protein L23
proteins.


BL00050A 23.71 9.250e-
27 94-127 BL00050B
14.81 8.125e-12 133-
147 


28 


PR00925


NONHISTONE CHROMOSOMAL
PROTEIN HMG17 FAMILY
SIGNATURE


PR00925B 3.73 3.089e-
10 41-54 


29 


PF00756


Putative esterase. 


PF00756C 14.12 1.108e- 
09 486-516 


32 


BL00557


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins.


BL00557D 17.76 5.065e-
37 274-316 BL00557A
35.08 8.905e-25 24-73
BL00557C 15.59 1.000e-
28 227-257 BL00557E
21.27 8.698e-22 130-
169 


34 


PR00629 


SHC PHOSPHOTYROSINE
INTERACTION DOMAIN
SIGNATURE


PR00629E 9.90 5.866e-
35 299-328 PR00629F
10.95 8.364e-32 334-
361 PR00629B 13.66
3.786e-27 224-247
PR00629A 13.45 8.364e- 
21 206-222 PR00629C 
3.80 4.000e-12 249-261 
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


36


PD01270


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.675e-38 94-131
PD01270D 24.66 2.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


37 


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298


38


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-296


39


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298


40


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e-
14 342-360 PR00380C
13.18 6.927e-13 375-
394 PR00380D 9.93
2.180e-12 425-451
PR00380A 14.18 5.154e-
12 143-165


44 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 239-290 BL00345A
13.96 2.452e-14 204-
223 


45 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A
13.96 2.452e-14 180-
199


46


DM01551


kw OSTEOINDUCTIVE YOPM
MEMBRANE OUTER.


DM01551A 15.63 3.538e-
26 172-202 DM01551C
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 9.328e- 
11 246-260 


4fc 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.231e- 
33 6-45 


50 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 7.750e-
19 994-1019 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72
9.471e-14 1020-1042
BL00972C 16.48 7.000e-
13 360-375 BL00972B
9.45 8.269e-10 302-312


51 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 7.750e-
19 990-1015 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72
9.471e-14 1016-1038
BL00972C 16.48 7.000e-
13 360-375 BL00972B
9.45 8.269e-10 302-312


52 


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.063e-
14 10-54 


53 


PR00988


URIDINE KINASE SIGNATURE


PR00988A 6.39 8.500e-
17 20-38 PR00988F
12.23 7.828e-15 196-
210 PR00988C 13.64
6.108e-14 104-120
PR00988E 8.27 3.872e-
11 174-186 PR00988D
5.95 6.878e-10 160-171
PR00988B 11.60 2.915e-
09 57-69 


55 


PR00762 


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e-
21 294-314 PR00762D
11.29 4.103e-19 509-
530 PR00762A 14.22
9.333e-18 199-217
PR00762F 15.12 5.100e-
16 563-563 PR00762B
12. :2 6.063e-16 230-
250 PR00762E 12.0'/
2.286e-15 545-565
PR00762G 14.13 6.276e-
13 601-616


56 


BL00216 


Sugar transport
proteins.


BL00216B 27.64 8.800e-
10 153-201- 


56 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 
10 1080-1135 


59


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM.


PD01929E 10.76 9.038e-
09 206-221 


68


PR00360


C2 DOMAIN SIGNATURE


PR00360A 14.59 7.395e-
09 680-693 


69 


PR00360


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 8.714e- 
10 51-64 


72 


DM00179


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 108-110 


73


BL00239


Receptor tyrosine kinase
class II proteins.


BL00239B 25.15 7.075e-
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 6.116e-
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I.


DM00471A 11.73 9.357e-
13 53-66 DM00471B
8.45 4.057e-12 70-83


80 


PD02876


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 223-236 PD02876D 
12.13 2.588e-12 334- 
351 


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 282-295 PD02876D
12.13 2.588e-12 393-
410


83 


BL00708 


Prolyl endopeptidase
family serine proteins.


BL00708B 24.91 7.197e-
12 570-601 


84 


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e-
09 985-1004 


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 0.2OOe- 
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107 


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


PR00081B 10.38 6.318e-
13 134-146 PR00081A
10.53 2.500e-12 54-72 


98 


PR00380 


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 5.500e-
24 401-423 PR00380D
9.93 7.188e-20 613-635
PR00380B 12.64 7.517e-
16 529-547 PR00380C
13.18 2.756e-13 560-
579


10; 


PR00300


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


PR CO 3 COA 9.SG 7. 54 be - 
14 289-30t 


ic--. 


BL00479


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 6.786e-
18 298-314 BL00479A
19.86 4.913e-16 155-
178 BL00479A 19.86
4.300e-13 272-295
BL00479B 12.57 6.294e-
12 181-197


106


BL01019


ADP-ribosylation factors
family proteins. 


BL01019A 13.20 8.013e- 
12 43-83 


107


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 5.000e-
16 403-416 


108


BL00191


Cytochrome b5 family,
heme-binding domain
proteins.


3L00191K 17.38 4.9Sle- 
27 238-282 BL0O19U 
11 .37 6 .447e-17 18?- 
204 


109


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.936e- 
37 8-47 


110


BL01138


Scorpion short toxins
proteins.


BL01138A 10.56 8.297e-
10 38-50 


11* 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 5.800e-
23 156-187 BL00107B
13.31 9.100e-14 225- 
241 


11"/ 


BL00214 


Cytosolic fatty-acid 
binding proteins. 


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 8.560e-
13 36-67 


119


PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR
SIGNATURE


PR00529C 11.03 7.506e-
10 158-177 


120 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


122 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


BL00215 


Mitochondrial energy
transfer proteins. 


BL00215A 15.82 7.158e- 
13 216-241 


12* 


BL01032 


Protein phosphatase 2C 
proteins.


BL01032C 6.14 3.195e-
12 147-157 BL01032H
11.25 5.680e-11 316-
331 BL01032G 8.33
8.932e-11 282-296
BL01032I 10.42 8.902e-
09 379-389 


129 


BL01310


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 6.694e-
26 28-64 


131


PR00990


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e- 
09 119-133 


132


BL00880


Acyl-CoA-binding
protein.


BL00880 17.52 5.576e- 
26 72-122 


134 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135


PR00215


NEUROMODULIN SIGNATURE


PR00215C 13.98 6.775e-
10 475-496 


136 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 71-107 


140 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.882e-
14 214-231 BL00028
16.07 9.471e-14 102-
119 BL00028 16.07
2.800e-13 18-35
BL00028 16.07 5.500e-
13 74-91 BL00028
16.07 9.100e-13 186-
203 BL00028 16.07
8.043e-12 46-63
BL00028 16.07 8.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07

BL00028 16.07 4.000e-
10 108-175


141 


BL00501


Signal peptidases I
serine proteins.


BL00501D 16.69 9.536e-
14 113-133 BL00501C
9.61 8.680e-10 89-101


143 


BL01020


SAR1 family proteins.


BL01020C 15.35 7.722e-
20 75-130 


146 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.400e-
25 335-374 


149 


BL00126


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e-
25 509-550 BL00126E
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.5ft 8.269e-11 442-
479 


151 


BL00632


Ribosomal protein S4
proteins.


BL00632 23.79 5.271e-
20 106-149 


154 


BL00559


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 5.304e-
19 29-58 BL00559K
13.17 2.957e-18 172-
199 BL00559J 19.63
8.385e-13 99-151
BL00559L 13.60 5.814e-
12 241-259


155


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 1.692e-
13 13-35


157 


BL00406


Actins proteins.


BL00406D 12.58 2.547e-
18 275-330 BL00406A
9.95 5.776e-16 15-50
BL00406B 5.47 7.429e-
12 69-124 BL00406C
6.75 9.682e-12 128-183


160 


BL00132


Zinc carboxypeptidases,
zinc-binding region 1
proteins.


BL00132A 26.07 7.000e-
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


165 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-158 


loo 




Ribosomal protein S15
proteins.


15 129-172 






DEAD-box subfamily ATP-
dependent helicases
proteins.


35 640-686 BL00039A 
lfl 44 1 964e-13 212- 
251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 3.721e-
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 2.432e-
29 133-169 


179 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.455e-
36 6-45




180 


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007B 14.16 7.429e-
20 160-180 PR00007A
19.33 4.938e-19 133-
160 PR00007C 15.60 
1.225e-15 206-228 
PR00007D 9.64 6.885e- 
11 236-249 


181


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


182 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 263-306


183


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


184 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 263-306 


18b 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE


PR00929C 5.26 3.328e-
09 460-471


189 


PR00929


AT-HOOK-LIKE DOMAIN
SIGNATURE


PR00929C 5.26 3.328e-
09 440-451 


190 


BL00383 


Tyrosine specific
protein phosphatases
proteins.


BL00383F 15.51 7.188e-
17 666-682 BL00383A
13.34 8.714e-17 162-
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
11 628-639 BL00383F
15.51 1.720e-13 373-
387 BL00383C 10.10
3.000e-13 217-226
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 3.692e-11 187-196
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
e.000e-09 479-488


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 7.911e- 
35 83-105 PR00450C 
12 22 G.?86e-13 47-69 


193 


PF00564 


Octicosapeptide repeat 
proteins.


PF00564B 24.74 6.164e-
16 227-278 


194 


PR00503 


BROMODOMAIN SIGNATURE


PR00503D 20.81 9.156e-
15 204-224 PR00503B
9.96 9.571e-13 170-187


195 


BL00901 


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.63 3.429e-
18 67-117 


197 


BL00636 


Nt-dnaJ domain proteins.


BL00636A 8.07 6.211e-
17 40-57 BL00636E
15.31 2.000e-13 67-88


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e-
09 463-482 


199 


BL01131 


Ribosomal RNA adenine 
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 8.352e-
12 509-522 


203 


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN
(LDL) RECEPTOR SIGNATURE


PR00261A 11.02 4.462e-
19 65-87 PR00261C
11.37 9.308e-19 65-87
PR00261D 12.47 2.667e-
18 65-87 PR00261B
14.12 4.000e-18 143-
165 PR00261A 11.02
4.833e-18 143-165
PR00261D 12.47 7.500e-
18 143-165 PR00261B
14.12 5.065e-16 65-87
PR00261C 11.37 8.960e-
16 143-165 PR00261F
11.57 4.938e-13 143-
165 PR00261E 11.08
7.188e-13 65-87
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
165


209 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 6.143e-
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007A 19.33 5.791e-
19 131-158 PR00007B
14.16 4.115e-18 158-
178 PR00007C 15.60
1.675e-15 201-223
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.546e-
30 43-91


213


BL00183


Ubiquitin-conjugating
enzymes proteins.


30 43-91 


216


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins . 


nifinm^fi *>i to t q n n »» - 

OliV UUjlfU <: Jt . O / i .JUUC" 

29 568-614 BL00039A 

BL00039C 15.63 1.720e- 

19.19 4.064e-11 277-
303 


217 


BL00100


Chloramphenicol
acetyltransferase
proteins . 


BL00100D 17.22 8.484e- 
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN
SIGNATURE


PR00213C 15.94 3.969e-
11 199-227 


222 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.947e-09
144-155


224 


PR00875 


MOLLUSC METALLOTHIONEIN
SIGNATURE


PR00875A 5.83 1.000e-
09 901-913 


22b 


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e- 
19 18-39 


22€ 


BL00636 




BL00636A 8.07 1.000e-
21 21-38 BL00636E 
15.11 8.200e-19 45-66 


229 


PR00301


70 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00301F 13.98 7.563e-
13 329-346 PR00301G
13.78 4.300e-12 361-
382 


230 


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 8.773e-
20 35-70 BL00460B
9.73 7.429e-16 78-96
BL00460C 14.35 2.831e-
12 111-134 BL00460D
16.89 8.773e-11 140-
160 


231 


PR00647 


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B 10.19 8.522e- 
09 273-287 


233 


BL00292 


Cyclins proteins.


BL00292B 20.31 7.429e-
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 6.308e-
13 7-29 PR00449C 






"23? ' " | pr cocas 



PR00019



23*7 



PD00289



240 



241 



244 



245



2 sir 



PR00011 



PR00011



BL00903



DM00179



PR00927 



LEUCINE-RICH REPEAT
SIGNATURE



LEUCINE-RICH REPEAT
SIGNATURE 



PROTEIN SH3 DOMAIN
REPEAT PRESYNA.



TYPE III EGF-LIKE
SIGNATURE



TYPE III EGF-LIKE
SIGNATURE



Cytidine and
deoxycytidylate
deaminases zinc-binding
region s.



w KINASE ALPHA ADHESION 
T-CELL . 



Wnt-1 family proteins.



ADENINE NUCLEOTIDE
TRANSLOCATOR 1 SIGNATURE






17.27 4.462e«13 4~7"cf~ 
PRC0449D 10 .7? 7.120e- 
11 105-123 



PR00019B 11.36 7.300e-
10 251-265 PR00019B
11.36 5.320e-09 119-
133 PR00019B 11.36
1.000e-08 229-243



PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113-
127 PR00019B 11.36
1.000e-08 223-237



PD00289 9.97 8.448e-09
67-81 



PR00011D 14.03 3.492e-
10 616-635



PR00011D 14.03 3.492e-
10 616-635



BL00903 12.93 8.941e-
12 54-64



DM00179 13.97 8.043e-
09 124-134



BL00246D 23.97 1.000e-
40 166-239 BL00246E
20.32 1.000e-40 305-
351 BL00246B 13.69
4.176e-36 105-140
BL00246A 15.75 2.286e-
24 70-90 BL00246C
15.56 4.857e-22 150-
175

PR00927E 14.93 5.114e-
10 253-275 



254 



BL00674



AAA-protein family
proteins.



BL00674B 4.46 1.000e-
09 223-245



PD01796 15.01 6.045e-
09 61-88 



255 



255 



PD01796



BL50002



PROTEIN TRANSMEMBRANE
COBALT ZINC CADMIU.
Src homology 3 (SH3)
domain proteins profile.



BL50002B 15.18 2.800e-
10 421-435 



259 



259 



PR00094 



BL00892



ADENYLATE KINASE 
SIGNATURE 



PR00094C 12.94 2.200e-
^8 87-104 PR00094D
12.52 2.731e-14 161-
177 PR00094A 10.31
5.500e-14 11-25
PR00094B 11.01 4.115e-
13 39-54 PR00094E
11.25 7.333e-13 178-
193



HIT family proteins.



BL00892A 18.17 5.500e-
13 60-91 



262 



264 



BL00388 



BL00903 



Proteasome A-type
subunits proteins.



BL00388A 23.14 1.000e-
40 8-54 BL00388B
31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-
21 153-184 BL00388C
18.79 8.147e-16 126-
148 



Cytidine and
deoxycytidylate
deaminases zinc-binding 
region s. 



BL00903 12.53 5.821e- 
09 91-101 



267 



BL00107



Protein kinases ATP-
binding region proteins.



BL00107B 13.31 1.529e-
09 241-257



270 



BL00226 



Intermediate filaments 
proteins. 



BL00226D 19.10 1.000e-
37 362-409 BL00226B











23 . 86 6 . 0'i3e 3S 196- 
244 BL00226C 13.2?
7.000e-20 261-252
BL00226A 12.77 6.143e-
15 96-111 


271 


PD02952 


kinase transferase: 

CHOLINE PROTEIN 
MULTIGENE FAMI.


PD02952C 15.76 5.731e-
16 235-265 PD02952D
15.57 5.625e-09 215- 
229 


272 


PD02929


ADHESION GLYCOPROTEIN
PRECURSOR 2.


PD02929A 2e.27 1.000e-
40 106-160 PD02929E
18.36 8.800e-17 179-
199 


274 


BL01027


Glycosyl hydrolases
family 39 proteins.


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e-
11 39-59 


277 


BL00052


Ribosomal protein S7
proteins.


BL00052A 27.eS 6.000e-
13 137-184 BL00052B
15.17 5.143e-12 208-
235 


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e-
13 267-294 


280


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 94-112 PR00319C
13.41 1.000e-21 76-92
PR00319A 15.27 8.364e-
21 38-55 PR00319B
11.47 8.200e-19 57-72


287 


PF00929 


Exonuclease.


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e-
09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


295 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


29^: 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 5.500e-
15 322-339 BL00028
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-723
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e-
09 617-634 BL00028
16.07 7.429e-09 485-
506


2 St. 


BL00215 


Mitochondrial energy
transfer proteins.


BL00215A 15.82 8.333e-
16 111-136 BL00215A
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e-
11 152-165 BL00215B
10.44 7.375e-10 59-72
BL00215A 15.82 9.824e-
10 205-230


302 


PF00953


Glycosyl transferase.


PF00953C 19.70 8.773e-
34 236-269 PF00953A
29.68 5.000e-25 102-
129 PF00953B 6.17
1.000e-13 182-194


304 


PF00152


tRNA synthetases class
II


PF00152D 21.30 8.364e-
28 422-461 PF00152C
28.03 9.250e-21 220-
257 PF00152B 15.67
2.658e-13 159-184
PF00152A 19.68 5.714e-
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.250e- 
35 37-76 


306


PD02784


PROTEIN NUCLEAR
RIBONUCLEOPROTEIN.


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e-
09 1167-1186 


308


PR00237


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-
13 188-212 PR00237G
19.63 7.207e-13 268-
295 PR00237A 11.48
4.375e-11 24-49
PR00237C 15.69 3.057e-
10 101-124 PR00237D
8.94 4.750e-10 137-159
PR00237F 13.57 5.364e-
10 230-255 PR00237B
13.50 9.438e-10 57-79


309 


BL00522


DNA polymerase family X
proteins.


BL00522C 11.90 7.677e-
24 315-339 BL00522F
14.90 3.310e-15 470-
494 BL00522A 25.52
1.265e-14 179-226
BL00522E 19.63 8.615e-
14 430-460 BL00522B
27.30 9.625e-12 267-
313


310


BL00326


Tropomyosins proteins.


BL00326D 8.76 5.235e-
10 856-897


312 


BL00290


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290A 20.89 4.706e-
14 151-174 BL00290B
13.17 9.000e-12 211-
229 


313 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A
13.96 9.217e-16 1-20


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e- 
15 63-76 


317 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.198e-
17 79-130 


318 


BL00216 


Sugar transport
proteins.


BL00216B 27.64 4.696e-
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235


321


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 5.688e-
10 329-372


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e-
12 558-577


324 


BL01241 


Link domain proteins.


BL01241 35.81 8.313e-
30 183-236 BL01241 
35.81 3.222e-13 282- 
335 


326


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 515-566 BL00412D
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569
BL00412D 16.54 1.827e-
09 514-565 BL00412D
16.54 1.918e-09 513-
564 BL00412D 16.54
2.102e-09 520-571


328 


BL00232


Cadherins extracellular
repeat proteins domain
proteins.


BL00232B 32.79 9.557e-
20 151-199 BL00232B
32.79 2.246e-18 41-89
BL00232B 32.79 5.985e-
18 370-418 BL00232B
32.79 5.500e-16 258-
306 BL00232B 32.79
9.384e-15 475-523
BL00232C 10.65 2.537e-
12 256-274 BL00232C
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e-
11 39-57


330 


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e-
09 1167-1186 


331 


BL00598


Chromo domain proteins.


BL00598 14.45 8.393e- 
18 27-49 


333


BL01016


Glycoprotease family
proteins.


BL01016C 22.84 3.925e-
32 70-115 BL01016E
14.88 5.286e-15 149-
177 BL01016H 13.71
7.577e-13 291-301
BL01016D 8.86 3.298e-
11 127-140 BL01016G
7.14 5.622e-10 261-271
BL01016A 5.65 7.167e-
10 4-19 BL01016F
13.34 1.563e-09 200-
212 BL01016B 8.93
£.855e-09 38-5C 


339 


BL01115 


GTP- binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.500e-
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 1.231e-
33 10-49 


341 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.042e-
09 55-109


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.400e-
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e- 
11 135-154


347


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


35: 


BL01187


Calcium-binding EGF-like
domain proteins pattern
proteins.


BL01187B 12.04 1.7fi3e-
13 100-116 BL01187B
12.04 8.435e-23 276-
292 BL01187B 12.04
8.300e-11 13-2S
BL01187B 12.04 7.429e-
10 54-70 BL01187B
12.04 5.725e-09 231-
247 BL01187A 9.98
7.000e-09 255-267


35^ 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078D 
13.14 4.522e-09 168- 
181 


354 


BL00380


Rhodanese proteins. 


BL00380F 9.76 6.094e- 
11 542-553 


355 


PF00628


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587


SOMATOSTATIN RECEPTOR
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


355 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
15 261-274 PD00066
13.92 6.500e-13 233-
246 PD00066 13.92
4.300e-09 289-302


361 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 9.604e-
13 54-109 PF00791D
28.49 1.095e-12 21-76
PF00791A 27.85 1.432e-
09 71-126 PF00791B
28.49 7.440e-09 184-
239 


362 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.273e- 
11 279-334 


363 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 5.080e-
10 73-95 PR00450C
12.22 3.278e-09 109- 
131 


364 


PF00242


DNA polymerase (viral)
N-terminal domain
proteins . 


PF00242O 13.51 2.328e- 
09 22-68 


365 


PF00242


DNA polymerase (viral) 
N-terminal domain 
proteins . 


PF00242O 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e-
09 1038-1092 


367 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 229-243 PR00019B
11.36 6.040e-09 SI-305
PR00019A 11.19 8.667e-
09 370-384


368


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 9.000e-
15 30-49 PR00011A

PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35


369 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL - 
BINDING NU. 


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE 


PR00170E 6.48 2.739e-
10 88-118


380 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 3.000e-
23 276-307 BL00107B
13.31 1.692e-12 342-


381 


BL00455


Putative AMP-binding
domain proteins.


BL00455 13.31 5.714e-
12 50-66


382 


PR00624 


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e-
09 524-544 


384 


PD00078


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.950e-
10 366-379 PD00078B
13.14 <.522e-09 168-
281 


385 


PR00511


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 6.000e- 
10 97-130 


388


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e-
13 316-529 


389


BL00290


Immunoglobulins and
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.6$7e- 
09 151-174 


390 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 S.200e-
15 221-246 BL00215A
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-1J 69-82
BL00215B 10.44 7.300e-
09 272-285 BL00215B
10.44 8.500e-09 165-
178


394 


BL00674 


AAA-protein family
proteins. 


BL00674B 4.46 2.723e-
16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.579e-
11 141-155 


398


PR00761 


BINDIN PRECURSOR
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240


Receptor tyrosine kinase
class III proteins.


BL00240B 24.70 7.907e-
10 118-142 


401 


PF00676 


Dehydrogenase E1
component.


PF00676B 24. 71 £.071e- 
18 331-369 PF00676D 
14.40 3.854e-15 486- 
506 PF00676C If .88 
9.182e-14 454-470 


402 


BL00514


Fibrinogen beta and
gamma chains C-terminal
domain proteins.


BL00514C 17.41 4.673e-
28 4432-4469 BL00514G
15.98 6.092e-14 4B55-
4585 BL00514D 15.35
2.532e-12 4473-4486
BL00514F 11.65 4.288e-
10 4519-4534 BL00514H
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin. 


PF00992A 16.67 5.974e-
09 105-140 


404


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.450e-
10 73-87 PR00019A
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B
11.36 1.000e-09 96-110


405 


BL00232


Cadherins extracellular
repeat proteins domain
proteins.


BL00232B 32.79 9.S57e-
20 139-187 BL00232B
32.79 2.246e-18 29-77
BL00232B 32.79 5.985e-
18 358-406 BL00232B
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-475
BL00232C 10.65 7.4S7e-
11 27-45




PF00426


Outer Capsid protein VP4 
(Hemagglutinin) . 


PF00426S 15. 67 S.634e- 
09 902-940 


405 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 9.69Se-
09 126-180 


410


BL00741


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e-
09 252-275 


411 


PF00646


F-box domain proteins. 


PF00646A 14.37 6.344e-
09 86-100 


41 ? 


BL00603 


Thymidine kinase 
cellular-type proteins.


BL00603B 11.39 8.500e-
09 542-557 


415 


BL00866


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 3.571e-
31 245-291 BL00866C
23.26 9.000e-25 333-
366 


41 fi 


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 6.114e-
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.955e-
14 23-78 PF00791B
28.49 3.653e-12 273-
328 PF00791B 28.49
4.273e-11 156-213
PF00791B 28.49 7.818e-
11 89-144 PF00791B
28.49 1.524e-10 56-111
PF00791C 20.98 3.559e-
09 37-76 PF00791C
20.98 5.235e-09 170-
209 PF00791C 20.98
5.235e-09 381-420
PF00791B 28.49 6.202e-
09 189-244 PF00791B
28.49 7.028e-09 435-
490 PF00791B 28.49
8.679e-09 367-422


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e-
10 228-251 


429 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.844e-
34 490-536 BL00039A
16.44 S.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e-
15 333-357


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B -1 -65 7.652e- 
12 169-185 


433 


PR00828


FORMIN SIGNATURE


PR00828B 5.23 8.218e- 
10 382-405 


436 


BL00415


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e-
11 221-234 


446 


PF01140 


Matrix protein (MA),
p15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246-
281


449


PR00568


DOPAMINE D3 RECEPTOR
SIGNATURE


PHC056DG 13 . 9f; . c . . 551e- 
09 39-5/ 


451 


PF00084


Sushi domain proteins
(SCR repeat proteins.


PF0008QB 9-4S 3.813e- 
10 47-59 


452 


BL00790


Receptor tyrosine kinase
class V proteins. 


BL00790I 20.01 2.822e-
09 618-649 


456 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D
9.93 1.000e-21 281-303
PR00380C 13.18 8.286e-
17 230-249 PR00380B
12.64 4.724e-U 154-
212 


457


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 S.543e-
24 246-267 PR00253B
13.47 2.000e-23 272-
294 PR00253C 13.85 
7.000e-23 306-326 
PR00253D 16.68 S.950e- 
21 452-473 


467 


PR00649 


GLYCOSYL HYDROLASE 
FAMILY 56 SIGNATURE 


PR00849D 9.77 5 ?36e- 
09 910-937 


473 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 8.200e-12 
33-44 


472 


BL00226 


Intermediate filaments 
proteins.


BL00226E 23.86 3.721e-
09 282-330 


473 


BL00344


GATA-type zinc finger
domain proteins. 


BL00344 17.99 7.000e-
12 814-852 


474 


BL00481


Thiol-activated
cytolysins proteins. 


BL00481E 13.07 6.909e-
09 173-159 


479


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 ?.S71e-
09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.900e-
38 8-47 


481 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 453-473 PR00405B
11.83 4.333e-18 430-
448 PR00405A 17.71
4.971e-18 411-43:


482 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 9.^86e-
10 959-974 PR00049D
0.00 9.857e-10 958-973
PR00049D 0.00 3.305e-
09 937-952 PR00049D
0.00 8.322e-09 939-954


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.61Se-
23 653-673 PR00007A
19.33 6.392e-22 626-
653 PR00007C 15.60
5.846e-19 698-720
PR00007D 9.64 3.647e-
13 732-743 


487


PD00567


PROTEIN RNA-BINDING RNA
KtrfcAT HIu. 


PD00567B 18.23 2.853e- 


488 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-
12 3-21 


489 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.e82e-
27 30-69 PD01066
19.43 3.430e-10 71-110 


490 


PR00049


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e- 
09 663-678 


492 


BL01128 


Shikimate kinase 
proteins . 


BL01128A 18.84 6.464e- 
17 58-92 


497 


PF00429 


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498


BL00120 


Lipases, serine 
proteins.


BL00120E 11.37 7.923e-
09 185-200 


500


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.353e-
21 299-318


501 


BL01159 


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 8.579e-
12 131-146 


505


BL00021 


Kringle domain proteins.


BL00021B 13.33 3.739e-
17 492-510


508 


PR00120


H+ TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE 


1 H00120C 9. 90 S.fcOOe- 
19 705-72? 


509


DM01417


6 kw INDUCING X?MC2 
MUSHROOM SPAC22G7.04.


DM01417E 20.62 2.938e-
•G 362-395 DM01417D
11.08 3.800e-13 322-
331 


510


PF00534 


Glycosyl transferases 
group 1 . 


PF00534B 14.47 6.625e-
09 346-370


511 


PF00534 


Glycosyl transferases
group 1.


PF00534B 14.47 6.625e-
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1 


PF00534B 14.47 6.625e-
09 366-390


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-291
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 16.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-886
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.396e-23 222-243
PD01841K 10.82 8.594e-
21 1054-1073 PD01841I
23.00 2.667e-13 549-
591


514 


PR00153 


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE 


PR00153C 11.01 7.188e-
13 95-111 PR00153E
9.10 4.150e-12 122-138


515 


BL00740


MAM domain proteins. 


BL00740A 13.87 7.188e-
12 410-423 


516


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e-
12 1018-1052 


517 


BL00242


Integrins alpha chain 
proteins . 


BL00242C 16.86 8.320e- 
09 12-42


523 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


525 


BL00319


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins . 


BL00319C 17.12 8.375e-
10 61-9S 


526 


PF00789 


Domain present in 
ubiquitin-regulatory
proteins . 


PF00789B 19.70 3.308e- 
12 322-343 PF00789C 
20.98 5.269e-09 367- 
392 


528 


BL01162 


Quinone oxidoreductase /
zeta-crystallin
proteins.


BL01162C 22.80 1.S00e-
16 120-164


529


PR00910


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2 . bi 3.893e-
09 50- T-


531 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148




BL00215 


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.000e-
15.82 £.660e-11 97-122


534 



BL00098


Thiolases acyl-enzyme
intermediate proteins.


BL00098C 21.G5 2.800e-
38 181-227 BL00098B
32.59 5.345e-38 86-141
BL00098D 26.30 8.364e-
35 245-286 BL00098E
22.12 1.000e-34 314-
352 BL00098F 10.18
4.971e-22 365-386
BL00098A 10.60 6.4S5e-
11 38-50


535


PR00370


FLAVIN-CONTAINING
MONOOXYGENASE (FMO)
SIGNATURE 


PR00370E 11.96 7.429e-
22 321-340 PR00370D
16.33 6.143e-21 185-
204 PR00370F 17.75
6.559e-21 376-296
PR00370B 10.91 9.591e-
21 27-46 PR00370C
12.72 3.500e-20 140-
157 PR00370A 3.35
6.442e-17 4-20


536


BL00028 


Zinc finger, C2H2 type,
domain proteins. 


BL00028 16.07 7.429e-
16 285-302 BL00028
16.07 6.294e-14 341-
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.4S2e-11 453-

7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330


537


BL00762 


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e- 

i.J 0l7"03t> 


539


BL00762 


WHEP-TRS domain
proteins . 


BL00762A 23.43 9.419e- 
15 822-859


540 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e-
10 3S7-375 


541


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL . 


PD02102A 16.74 1.000e-
40 3-47 PD02102B 

_ 0 . & o 1 . 0 f J*f j ' 1 y u 

PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 6.929e-26 100- 
146 


542 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 1.000e-
10 48-65 BL00028 
16.07 6.400e-10 193- 
210 BL00028 16.07 
1.000e-09 343-360
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250 


TGF-beta family
proteins . 


BL00250A 21.24 8.000e- 
31 293-329 BL00250B 
27.37 5.2€6e-24 354- 
390 


547


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 186-201 PR00319A
15.27 7.344e-09 210-
227


548


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins. 


BL01204A 17.74 1.000e-
40 e-S6 BL01204D
16.42 1.000e-40 177-
221 BL01204E 13.83
7.6S2e-30 225-250
BL01204C 13.93 8.714e-
22 141-160 BL01204B
15.41 4.333e-16 102-
116


549


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e- 
15 255-270 


551 


PF00632 


HECT-domain (ubiquitin-
transferase).


PF00632C 20.66 3.302e- 
23 1569-1601 PF00632B
18.45 3.700e-21 1515-
154? 


554 



BL00290 


Immunoglobulins and
major histocompatibility 
complex proteins. 


BL00290B 13.17 1.600e- 
14 187-205 BL00290A 
20.89 2.059e-14 130- 
153 




DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 6.339e- 
09 846-879 


559 


DMOllll 


4 kw PHOSPHATASE 
TRANSFORMING 61 K PDF1 . 


DM01113L 11 . 93 3 .762e- 
09 7-31; 


562


PF00658 


Poly-adenylate binding
protein, unique domain 
proteins . 


PF00658C 16.33 9.455e-
32 118-155 


564




Eukaryotic and viral
aspartyl proteases
proteins.


10 472-488 


SGt 


PF00855


PWWP domain proteins.


PF00855 13.75 5.667e- 
15 272-289


567 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.977e-
13 229-268 


565 


BL00107 


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183-
199 


bVL 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E
19.47 6.559e-19 508- 
537 


573 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 470-499 PR00193C 
12.60 2.636e-31 239- 
267 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E
19.47 6.559e-19 524- 
553 


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e-
10 9B5-32S 


576 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295


577 


BL00116 


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116r
11.82 1.529e-12 952-
965


578


BL00195 


Glutaredoxin proteins 


BL00195B 15.31 7.15fce-
OS 121-14: 


579 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 9.000e-
11 217-231 PR00019B
11.36 1.360e-09 386-
400 PR00019A 11.19
3.333e-09 389-403
PR00019B 11.36 8.920e-
09 363-377


580


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 2.125e-
25 275-296 PR00253B
13.47 7.923e-24 301-
323 PR00253D 16.68
S.B46e-23 444-465
PR00253C 13.85 2.241e-
20 335-357 


583 


PR00343


SELECTIN SUPERFAMILY
COMPLEMENT-BINDING
REPEAT SIGNATURE 


PR00343C 16.85 2.286e-
11 1233-1252 PR00343C
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e-
10 1491-1510 PR00343C
16.85 8.230e-10 1686-
1705 


584 


DM01537 


kw SKI2W SKI2 NUCLEOLAR
HELICASE.


DM01537B 21.63 1.876e-
37 79-126 DM01537B
21.63 9.491e-30 916-
963 DM01537A 15.14
3.196e-11 784-804


586 


PF00013


KH domain proteins 
family of RNA binding
proteins . 


PF00013 5.78 1.450e-09
124-136 


587 


DM00892


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.409e-
13 262-296


589


BL00478 


LIM domain proteins. 


BL00478B 14.79 1.643e-
13 261-276 BL00478B
14.79 7.709e-09 321-
336 


590


PF00855 


PWWP domain proteins.


PF00855 13.75 8.000e-
15 931-948


591


PF00855


PWWP domain proteins 


PF00855 13.75 8.000e-
15 1062-1079 


593 


PF00628 


PHD-finger.


PF00628 15.84 3.455e-
12 424-439 


594


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.308e-13 542-
558 PR00205C 13.65
5.304e-12 594-609
PR00205B 11.39 4.273e-
10 336-354 


596


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 4.789e- 
18 307-338 


598


PD01675


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3 . 


PD01675C 19.89 2.3~30e- 
10 55-39 


600 


BL00242 


Integrins alpha chain
proteins . 


BL00242E 9.03 9.591e-
27 985-1014 BL00242C
16.86 4.115e-26 286-
316 BL00242D 13.57
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80
5.000e-11 61-73
BL00242D 13.57 4.yKte-
10 291-316


601


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 5.610e-
09 198-213 


602 


PR00278


PANCREATIC HORMONE
SIGNATURE 


PR00278A 12.43 4.56Se-
10 331-348


603 


BL00479


Phorbol esters / 
diacylglycerol binding
domain proteins. 


BL00479C 12.01 3.2^0e- 
12 170-183 


604 


BL00315 


Dehydrins proteins.


BL00315A 9.35 1.672e-
09 424-452 


605 


BL00415


Synapsins proteins. 


BL00415N 4.29 9.794e-
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358 


608 


PF00855


PWWP domain proteins.


PF00855 13.75 5.167e-
15 265-282 


609 


PF00855


PWWP domain proteins.


PF00855 13.75 5.167e-
15 211-228 


612 


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.43je-
10 877-897 DM01206B
10.69 8.027e-10 861-
881 DM01206B 10.69
9.137e-10 873-893
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 879-
899 DM01206B 10.69
4.076e-09 865-885
DM01206B 10.69 7.038e-
09 898-918 DM01206B
10.69 7.949e-09 871-
891 DM01206B 10.69
8.291e-09 767-787


615


PD02699


PROTEIN DNA-BINDING
BINDING Dm 


PD02699A 8.91 2.023e-
28 129-158 PD02699C
24.84 1.000e-27 317-
364 PD02699B 18.28
1.000e-17 158-182


616 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508
PR00380B 12.64 2.241e-
16 410-428 PR00380C
13.18 2.976e-13 436-
455


617 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508
PR00380B 12.64 2.241e-
16 410-428 PR00380C
13.18 2.976e-13 436-
455 


618 


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 5.143e-
12 531-551 DM01206B 
10.69 2.603e-10 535- 
555 


621 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e-
21 561-582 


622 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239F 28.15 3.222e- 
10 647-692 BL00239C
18.75 8.304e-10 543- 
566 


623 


PR00407 


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e-
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 2S5-
308 BL00641F 33.12
1.000e-40 571-f73
BL00641A 17.15 1.818e-
37 4B-8C BL00641B
12.62 5.846e-34 113-
139 BL00641D 33.23
S.308e-29 216-1.40


627 


PR00103 


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103E 17.8C 1.S00e-
18 367-380 PR00103B
13.39 2.080e-1< 297-
312 PR00103A 9.59
2.957e-14 282-297
PR00103D 10.83 3.077e-
12 346-358 PR00103C
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 3.720e-
10 160-175


630


PR00081


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 e Alle-
le 4-22 


631 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF006S1 15.00 e.sooe- 
14 37-50 


632 


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 ;.233e-
10 1324-1344 DM01206B
10.69 4.822e-10 1276-
1296 DM01206B 10.69
7.G58e-10 1328-1348
DM01206B 10.69 e.274e-
10 1280-1300 DM01206B
10.69 4.532e-09 1320-
1340 DM01206B 10.69
7.266e-09 1326-1346


635 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 ?.600e-
23 145-176 BL00107B
13.31 2.636e-13 211-
227


636 


BL00657 


Fork head domain 
proteins . 


BL00657A 19.39 1.54Se-
30 101-143 BL00657B
22.27 7.7S0e-26 149-
192 


637 


BL00107


Protein kinases ATP- 
binding region proteins.


BL00107B 13.31 : . 000c- 
10 607-623 


643 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 4.913e-09
199-212 


647 


PF00628 


PHD-finger.


PF00628 15.84 2.3S0e- 
13 385-400 PF00628 
15.84 3.455e-12 464- 
479 


648 


BL01129 


Hypothetical
yabO/yceC/sfhB family
proteins.


BL01129E 13.25 4.000e-
25 332-357 BL01129C
25.56 8.200e-23 236-
275 BL01129B 12.51
6.118e-13 191-21: 


649 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 3.i>08e- 
10 455-480 


650


BL00027


'Homeobox' domain
proteins . 


BL00027 26.43 6.664e-
13 771-814 


651 


BL50002 


Src homology 3 (SH3)
domain proteins profile. 


BL50002A 14.19 1.750e-
12 1026-1045 


653 


PR00253


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 4.000e-
24 253-274 PR00253C
13.85 8.800e-24 313-
335 PR00253B 13.47
3.143e-22 279-301
PR00253D 16.68 7.652e-
20 422-441


654 


PD01719


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e-
11 969-997 PD01719A
12.89 3.961e-10 128-
156 PD01719A 12.89
7.395e-10 1276-1304
PD01719A 12.89 1.222e-
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 8.397e-
09 563-57F


658




HMG-I and HMG-Y DNA-
binding domain proteins


BL00354C 6.61 8.397e-
09 580-591


659 


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.174e-
13 539-572 DM00215
19.43 4.7S0e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
S.339e-10 552-585
DM00215 19.43 7.107e-
10 544-577


660 


PR00688


XYLOSE ISOMERASE 
SIGNATURE


PR00688I 13.78 9.516e-
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.950e- 
23 249-292




PR00360




PR00360B 13.61 7.158e-
10 596-610


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610 


664 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610 


666 


PR00819 


CBXX/CFQX SUPERFAMILY
SIGNATURE


PR00819B 10.83 U.900e- 
10 704-720 


667 


BL50040 


Elongation factor 1
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-17* 


ecu 
t>o 0 




LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e- 
09 139-152 PR00019A
11.19 1.667e-09 94-108
PR00019B 11.36 4.600e-
09 163-177


670 


BL00018


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 3.250e-10
681-694 BL00018 7.41
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-41C PD00131C 
19.59 1.346e-26 504- 
542 


673 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667G 15.33 7.557e-
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e-
13 593-608 PR00320B
12.19 4.115e-12 635-
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e-
10 635-650 PR00320C
13.01 6.400e-10 593-
608 PR00320B 12.19
3.250e-09 593-608


675 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e-
13 572-587 PR00320B
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19
3.250e-09 572-587


676 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 9.667e-
09 245-263 


679 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 3.700e-
16 225-236 PF00642
11.59 7.900e-12 187-
198 


680 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e-
10 266-296 


681 


BL00019


Actinin-type actin-
binding domain proteins. 


BL00019D 15.33 4.200e-
19 227-257 


682 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e-
09 99-118 


687


PR00049


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e- 
10 538-553 


689 


BL01024 


Protein phosphatase 2A 
regulatory subunit PR55
proteins . 


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 66-127
BL01024C 7.80 1.000e-
40 146-285 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027


'Homeobox' domain
proteins . 


BL00027 26.43 8.071e-
31 152-195 


692


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e-
09 45-57 


693


BL00211


ABC transporters family 
proteins.


BL00211A 12.23 5.050e- 
09 45-57 


694


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 58-70 


696


BL00680 


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 14.37 5.304e- 
17 173-195 


697


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 3.418e-
11 242-265 


698


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W.


DM01930E 15.41 1.367e- 
37 170-235 DM01930F 
14.16 8.232e-28 267-
303 DM01930B 19.86
9.163e-10 37-71 


700


PR00869


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 2.174e- 
10 77-91 PR00048A 
10.52 6.870e-10 133-
147 PR00048A 10.52
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 2.S6Se-
25 326-356 BL00523A
13.36 5.050e-16 38-55
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89
1.B44e-11 290-302
BL00523G 9.46 5.50Ge-
10 SK--523 BL00523F
10.85 6.351e-09 413-
424


1 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 8.412e-
12 376-390 PR00048B
6.02 1.000eOO 334-344
PR00048B 6.02 1.474e-
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00707A 14.84 8.941e-
14 66-82 


708 


PR00761 


BINDIN PRECURSOR
SIGNATURE


PR00761E 14.32 8.500e-
10 822-841 


712 


DM01354


kw TRANSCRIPTASE REVERSE
31 ORF2.


DM01354Y 10.69 4.977e-
38 42^-465 DM01354X
13.86 7.300e-34 376-
415 DM01354V 12.97
4.923e-17 311-358
DM01354W 12.64 5.596e- 
10 356-376 


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.S4Se-
27 450-496 BL00039A
16.44 2.537e-18 147-
186 BL00039C 15.63
2.216e-14 280-304
BL00039B 19.19 1.947e-
13 194-220


715 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins.


BL00383E 10.35 4.981e-
10 150-161 


717 


PF00777 


Sialyltransferase
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e-
39 20-66 DM00031B
15.41 2.688e-28 84-110
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243


Integrins beta chain
cysteine-rich domain
proteins.


BL00243B 17.S4 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e-
09 610-653


720 


PR00217 


43 KD POSTSYNAPTIC
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e- 
09 20-36 


722 


PR00704


CALPAIN CYSTEINE
PROTEASE (C2) FAMILY
SIGNATURE 


PR00704D 11.05 5.905e-
34 135-161 PR00704F
13.61 7.000e-26 130-
218 PR00704E 12.55
8.071e-26 165-189
PR00704E 27.94 2.241e-
23 75-98 PR00704A
14.68 4.(^4e-19 30-^4
PR00704C 11.88 1.871e-
ie 99-11'-


725 


PR00194


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187


726 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187


727 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.)2be-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e-
11 323-338 PR00320B
12.19 4.343e-10 323-
338 PR00320B 12.19
6.914e-10 277-292


731 


PR00195


DYNAMIN SIGNATURE


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-13 457-474 


733 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.082e-
10 787-798


73 H 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039A 16.44 2.56be-
28 26-65 BL00039D
21.67 2.105e-20 338-
384 BL00039C 15.63
9.100e-13 160-184
BL00039B 19.19 9.617e-
1J 73-99


739 


BL01289 


TSC-22 / dip / bun
family proteins. 


BL01289A 12.18 8.909e-
31 326-35> BL01289B
10.45 9.S71e-17 353-
383 


742 


BL01019


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 7.0/Be- 
12 41-81 


743 


BL00965 


Phosphomannose isomerase
type I proteins. 


BL00965C 23.78 1.000e-
40 256-30? BL00965B
17.77 1.600e-25 126-
153 BL00965A 10.57
6.400e-1? 94-113


747 


BL00021 


Kringle domain proteins. 


BL00021D 24.S6 4.563e-
25 231-273 BL00021B
13.33 5.345e-21 60-78


748 


BL00612 


Osteonectin domain 
proteins. 


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 6.880e-
10 135-157


752 


BL00795


Involucrin proteins. 


BL00795C 17.06 6.000e-
11 384-429 BL00795C
17.06 9.444e-11 370-
415 


754 


BL00051


Ribosomal protein L39e
proteins. 


BL00051 20.92 1.935e-
16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B fc.60 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 15. 3S 9.020e- 
12 99-150 


762 


BL00046


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 33-88 


763 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR . 


PD02411 21.89 9.137e-
10 206-240 


764 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208


VWFC domain proteins. 


BL01208B 15.83 6.063e- 
10 309- 324 BL01208B 
15.83 8.031e-10 165- 
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BNSDOCID: <WO 01?.3312A1 J_> 
SEQ ID NO:


ACCESSION
NO.


DESCRIPTION


RESULTS*








1 On ul ,m "7 r, i» ri k r □ 

4.162e-09 85-100


f no 


BL00031


Nuclear hormones
receptors DNA-binding
region proteins


BL00031A IS. 55 9.57ie- 1 
32 208-243 BL00031B
22.25 G.5O0e-27 242- 
274 


112 

i 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 1.450e-
16 4-26 PR00449E
13.50 3.520e-14 142-
165 PR00449C 17.27

PR00449D 10.79 8.579e-

ii i m ioi i)v nr\A a ou 
1.9 JU / -1ZJ rKUU4fl?K 

14.34 3.4E5e-ll 27-44 


i 

\ 


BL00523


Sulfatases proteins.


BL0Oi>23b IV. 27 9.3i3e- 
23 299-329 BL00523A 

1 j JD Z.ZUL'e-JLJ 4 /- OS 

BT00523B 8.64 2.607e- 
13 91-103 M.00S23D 
9.e9 7.9236-12 224-236 
BJ0C(J523C 2^.64 4.5J2e- 
10 141-152 BL00522F 
10.85 5.823e-10 373- 1 
384 


IIS 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 568-585


116 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 621-636 


111 


BL00028 


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 595-612 


778


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 8.431e-
11 322-341 BL00030A 
14.39 7.000e-30 220- 
239 


77 5 


PR00079


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 


PR00079B 12.98 2.92<?e- 
26 193-222 PR00079E 
16.65 4.150e-23 348-
375 PR00079C 8.68
6.353e-16 246-264
PR00079D 13.51 7.070e-
16 264-281 PR00079A 
16.12 6.769e-23 169- 
183 


763 


BL00215 


Mitochondrial energy
transfer proteins.


BL00215A 15.82 9.250e-
17 10-35 BL00215A
15.82 6.000e-16 221-
246 BL00215A 15.82
7.857e-12 106-133 
BL00215B 10.44 S.526e- 
11 168-181 


783 


PD00299 


REPEAT PRESYNA. 


PD002H9 9 97 6 ?76£-()9 
159-173 


785 


BL00690


DEAH-box subfamily ATP-
dependent helicases
proteins.


BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124
BL00690C 7.51 3.189e-
09 218-228 


786 


PR00449 


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449C 17.27 8.500e-
16 50-73 PR00449A
13.20 5.235e-14 8-30
PR00449E 13.50 2.853e-
11 150-173 PR00449D
10.79 1.545e-09 111-
125 


?es 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 8.767e-
10 1-21


790 


BL00915


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e-
39 725-764 BL00915B








22.78 5.050e-33 63?
671 BL00915U 27.02
1.529e-21 7S5-e33
BL00915A 10.09 1.000e-
13 395-407 


791 


PR00208


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE


PR00208A 12.59 6.294e-
10 120-138 PR00208A
12.59 6.294e-10 121-
139 PR00208A 12.59
6.294e-10 122-140
PR00208A 12.59 6.294e-
10 123-141 PR00208A
12.59 6.294e-10 124-
142 PR00208A 12.59
6.294e-10 125-143
PR00208A 12.59 6.294e-
10 126-144 PR00208A
12.59 6.294e-10 127-
145 PR00208A 12.59
6.294e-10 128-146
PR00208A 12.59 6.294e-
10 129-147 PR00208A
12.59 7.411e-09 130-
148 PR00208A 12.59
7.658e-09 131-149
PR00208A 12.59 7.904e-
09 132-150 PR00208A
12.59 8.274e-09 118-
136 PR00208A 12.59
8.274e-09 119-137


795 


PR00205 


CADHERIN SIGNATURE


PR00205B 11.39 5.034e-
16 302-320 PR00205A
14.73 1.257e-11 284-

1.333e-11 337-352


796 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-13 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D
16.54 1.918e-09 194-
245 BL00412D 16.54
2.102e-09 201-252 


797 


BL00021


Kringle domain proteins. 


BL00021B 13.33 6.339e- 
13 40-58 


799 


BL01052 


Calponin family repeat 
proteins. 


BL01052C 18.51 1.000e-
40 87-127 BL01052A 
16.12 1.529e-32 3-35 
BL01052B 15.31 1.2S7e- 
25 52-78 BL01052D 
10.26 5.737e-2S 174- 
194 


800 


BL00348


p53 tumor antigen
proteins.


BL00348F 23.19 3.714e-
09 197-240 


801 


BL00309


Vertebrate galactoside-
binding lectin proteins.


BL00309C 18.65 1.621e-
09 62-87 


802 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774 


Dihydropyridine
sensitive L-type calcium 
channel (Beta subuni. 


PF00774A 16. 4'/ 8.457e- 
10 110-156 


808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 5.875e- 
09 12-28 


810 


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR


PD02346F 12.89 4.340e-
09 317-354 






PHOTOSYNTHESIS.




811


BL00685


CBF-A/NF-YB subunit
proteins.


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 S-S4 


812


PR00080


ALCOHOL DEHYDROGENASE
SUPERFAMILY SIGNATURE


PR00080A 9.32 9.419e-
10 93-10S 


813


BL00357


Histone H2B proteins.


BL0O3B7 7 . 74 1 90fce-17 
22-65 


81 b 


PD00066


PROTEIN ZINC-FINGER


PD00066 13.92 7.?23e-
15 158-172 PD00066
13.92 5.200e-14 46-59
PD00066 13.92 7.000e-
14 10-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-22-
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


rct/L iuyj i t\i »/i jiyux u a aac 

proteins. 


£> XJ v X . ^ J ^ *L \J 1 J , J j u c 

20 100-139 


820 


BL00520


Interleukin-10 family
proteins.


BL00520A 6.21 6.471e-
09 1-14 


822 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972A 11.53 8.113e-


"825 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE


PR00876B 7.66 2.268e-
10 101-llS 


823 


PD02855


FLAVOPROTEIN PROTEIN
DNA/PANTOTHEN.


PD02855A 18.37 4.732e-
28 88-124 PD02855B
8.36 6.476e-09 132-142 


830 


PR00405


HIV REV INTERACTING
PROTEIN SIGNATURE


PR00405B 11.83 7.000e-
21 44-62 PR00405C
19.41 1.000e-13 65-87
PR00405A 17.71 7.283e-
13 25-45 


831 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.39 1.000e-
09 47-61 PR00019F 

11. Jb J. . /ZUc* J. f - 

150 PROOOIPB 11.3fc 
^ Rqnp.fici ac c. q 

_3 . O O U CT ^» :> O 


832


PR00011


TYPE III EGF-LIKE


PR00011B 13.08 3.438e-
16 164-181 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231 


834 


PD00306


PROTEIN GLYCOPROTEIN
PRECURSOR RE.


PD00306A 10.26 7.000e-
12 232-246 


835


PD00306


PROTEIN GLYCOPROTEIN
PRECURSOR RE.


PD00306A 10.26 4.000e-
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 3.858e-
09 78-111 


835 


PD02784


PROTEIN NUCLEAR
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e-
22 369-390 PR00700D 
12.47 5.765e-21 491- 
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e- 











11 538-54 5- PR00700?. 
37. S7 3.300e-l0 522- 
53 1 


642 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 5.404e-
13 134-152 


844


PD02785


PROTEIN RIBOSOMAL 60S
L22 RNA-BINDING HEP.


PD02785B 34.43 1.000e-
40 58-112 PD02785A
15.23 l.?:5e-28 8-57 


845 


BL00826


MARCKS family proteins.


BL00826C 7.63 6.738e-
05 203-53 C 


840 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 4.429e-
10 15-24 


34? 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-
08 340-34? 


850 


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308A 5.90 6.506e-
09 12-27 


853 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 23.89 7.000e-
16 246-280


852


BL00420


Speract receptor repeat
proteins domain
proteins.


BL00420B 22.67 1.000e-
40 723-778 BL00420B 
22.67 1.32Se-38 933- 
988 BL00420B 22.67 
8.4S7e-28 482-537 
BL0042OB 22 . 67 4 .500e- 
27 587-642 BL00420B 
22.67 9.625e-27 270- 
325 3L0C420B 22.67 
4 . 20be-26 163-218 
BL00420B 22 .67 5.731e- 
23 55-110 BL00420B 

22 .67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 830-885 
BL00420C 11 90 1 .900e- 

23 355-366 BL00420C 
11.90 1.900e-12 808- 
819 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11 .90 5 .119e-ll 1018- 
1029 BL00420C 11.90 
7.955e-10 567-578 


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins.


BL00420B 22.67 1.000e-
40 756-811 BL00420B

22 .67 1 .32le-38 966- 
1023 BL00420B 22.67 
8.457e-28 482-537 
BLC0420B 22.67 4 . SOOe- 
27 620-675 BL00420B 
22.67 9.625e-27 270- 
325 3L00420B 22.67 
4.205e-26 163-218 
BL00420B 22.67 5.731e- 

23 55-110 BL00420B 
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.8C0e-15 863-918 
3L00420C 11.90 1.900e- 
13 355-366 BL00420C 
11,90 1 .900e-12 841- 
852 BL0C420C 11.90 

3 . 550e-12 248-259 
BL0C420C 11.90 2.83le- 
11 341-152 BL0042OC 
11.50 5.119e-ll 1051- 
1062 BL00420C 11.90 








7.955e-lC 567-578 




PR00388


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


PR00388A 10.45 2.778e-
09 G4-83 


85? 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 2.92Se-
13 37-56 BL00030B
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e-
10 128-147 


862


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4 .250e- 
17 23-41 PR00988C 
13.64 8.7l4e-16 107- 
123 PR00988F 12.23 
7.626e-lS 198-212 
PR00988E 0.27 9.763e- 
12 176-288 PR009B8D 
5,95 8.2S0e-ll 163-174 
PR00988B 11.60 4 . 512e- 
10 60-72 




BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e-
12 41-54 


86 < 


PR00775


90 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00775E 8.06 1.000e-
24 198-221 PR00775B 
3 .52 1 . 837e-23 107-130 
PR00775D 8 .91 4 .484e- 
17 171-289 PR00775A 
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 FR00775G 
10. 64 6 . 850e-15 267- 
286 PR00775F 12 . 76 
6.769e-14 249-267 


866 


DM01688


2 POLY-IG RECEPTOR. 


DM01688G 16.45 9.460e- 
09 89-121 


86*; 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 


868 


BL01287 


RNA 3 ' -terminal 
phosphate cyclase 
proteins . 


BL01287A 17.95 2.688e-
26 16-48 


B6& 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 6.464e-
10 304-337 


872 


BL00046


Histone H2A proteins.


BL00046 12.95 3.000e-
40 30-85 


874 


BL00188 


Biotin-requiring enzymes
attachment site 
proteins . 


BL00188 30.29 S.036e- 
32 665-711 


876 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 298-315 


877 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL . 


PD02102A 16.74 4.176e- 
10 97-141 


879 


BL01189


Ribosomal protein S12e
proteins.


BL01189A 14.27 1.000e-
40 35-71 BL01189B
13.49 1.000e-40 71-126


882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e- 
25 62-104 BL00284B
17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391 


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.328e- 
13 191-207 PR00391A 
7.83 5.390e-11 16-36


897 


PR00327 


ICE NUCLEATION PROTEIN 


PR00327C 6.37 5.247e- 






SIGNATURE


09 313-328


89 fc 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e-
26 386-432 BL00039A
18.44 6.674e-16 113-
152 BL00039B 19.15
1.947e-13 153-179
BL00039C 15.63 9.460e-
11 236-260 


90: 


PD00066 


PROTEIN ZINC- FINGER 
METAL-BINDI.


PD00066 13.92 8.200e-
16 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066
13.92 8.200e-16 394-
407 PD00066 13.92
B.^otie-14 338-351 


902 


BL01115


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806


VINCULIN SIGNATURE


PR00806B 4.28 9.160e-
09 97-111 


904 


PR00381 


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e- 
25 335-356 PR00381P 
18.17 2.667e-24 204- 
224 PR00381A 9.55 
2.800e-24 107-125 
PR00381C 12.48 4.522e- 
24 226-245 PR00381D
13.94 1.084e-22 291-
309 PR00381F 9.13
3.2R8e-22 370-392
PR00381F 9.13 7.181e-
13 286-308 PR00381E
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e-
11 293-314 PR00381E
8.75 8.364e-10 377-398
PR00381D 13.94 5.230e-
09 333-351 PR00381C 
12.48 7.120e-09 310- 
329 


906 


PR00345


STATHMIN FAMILY
SIGNATURE 


PR00345C 4.54 6.557e- 
09 525-549 


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
144-155 


910 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.800e- 
30 48-87 


912 


BL01104


Ribosomal protein L13e
proteins.


BL01104C 15.14 6.000e-
09 364-352 


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-0S 
500-511 


923 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.500e-
09 323-338 PR00320C 
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-54 


926 


BL00019


Actinin-type actin-
binding domain proteins.


BL00019C 14.66 7.453e-
25 108-144 BL00019B
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e-
11 205-235 BL00019A
12.56 2.373e-10 34-45 


928 


BL00678 


Trp-Asp (WD) repeat


BL00678 9.67 9.308e-11






proteins proteins.


273-284 BL00678 9.67
1 . GOOe-iC 314-325
BL00678 9.67 7.600e-10
360-371 BL00678 9.67
6.579e-09 206-210


929 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 1.6b7e-
10 137-146 


930 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 134-165 BL01085B
10.15 5.680e-22 30-52
BL01085E 18.87 8.676e- 
20 172-202 BL01085C 
21.81 2.038e-14 66-97 


931 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 152-183 BL01085B
10.15 5.680e-22 30-52
BL01085E 18.87 8.676e-
20 190-220 BL01085C
21.81 2.038e-14 66-97 


533 


PD00301


PROTEIN REPEAT MUSCLE 
CALCIUM- BI. 


PD00301A 10.24 6.400e- 
09 160-171 


936


PF00168


C2 domain proteins.


PF00168C 27.49 4.000e-
12 336-362 


937 


BL00415 


Synapsins proteins.


BL00415N 4.29 9.519e-
10 5-49


940 


PR00862 


PROLYL OLIGOPEPTIDASE 
SERINE PROTEASE (S9A) 
SIGNATURE 


PR00862D 16 . 17 4 . 086e- 
09 63-84 


945


BL01230


RNA methyltransferase
trmA family proteins.


BL01230B 11.62 2.373e- 
09 407-420 


948


BL00479


Phorbol esters /
diacylglycerol binding
domain proteins. 


BL00479B 12.57 7.429e- 
18 52-68 BL00479A 
19.86 2.200e-13 26-49 


949 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE
NAD INTERGENIC RE. 


PD01311A 30.23 S.909e- 
10 66-311 


955 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e-
12 47-60


956 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e-
12 47-60 


957 


BL00379


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.64 1.610e-
35 111-148 


959 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 1.884e-
10 31-75 


960 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.438e-
14 110-154


962 


BL00061 


Short-chain
dehydrogenases/reductase
s family proteins.


BL00061B 25.79 6.586e-
13 198-236


963 


PR00502 


MUTT DOMAIN SIGNATURE


PR00502A 15.06 8.200e-
11 210-225


966 


PR00308 


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308A 5.90 7.C35e-
09 55-70


967 


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
126 DM01206B 10.69
5.671e-09 38-58


969 


PF01008 


Initiation factor 2 
subunit.


PF01008B 25.59 4.724e-
31 417-460 PF01008C
12.25 5.333e-18 506-
526 PF01008A 20.14
5.87Se-15 369-390


970


BL01277


Ribonuclease PH
proteins.


BL01277C 10 .IS 7 . 648e- ■ 
10 112-143 BL01277A 
17.39 9.806e-10 40-76 ; 


975 


BL01159


WW/rsp5/WWP domain
proteins.


BL01159 13.85 3.60Se-
12 130-145 BL01159
13.85 4.122e-10 171-
186 


977 


PF00791


Domain present in ZO-1
and Unc5-like netrin 
receptors . 


PF00791C 20.96 2.235e- 
09 55-94 


978 


BL01167


Ribosomal protein L17
proteins.


BL01167B 20.66 8.258e-
19 88-127 


979 


BL00478


LIM domain proteins.


BL00478B 14.75 9.357e- 
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312


CALSEQUESTRIN SIGNATURE


PR00312E 8.32 3.423e-
36 169-199 PR00312I
15.78 5.286e-35 332-
361 PR00312F 15.06
5.865e-35 199-229
PR00312H 13.31 8.313e-
35 263-291 PR00312J
13.73 5.688e-34 363-
392 PR00312D 9.43
2.636e-33 128-158
PR00312C 15.14 8.639e-
33 92-122 PR00312B
15.08 8.941e-33 62-92
PR00312G 11.11 6.657e-
32 230-258 PR00312A
13.70 6.914e-27 35-59 


981 


PF00992 


Troponin . 


PF00992A 16.67 6.816e- 
09 414-449 


982 


PR00299 


ALPHA CRYSTALLIN
SIGNATURE


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150


Respiratory-chain NADH
dehydrogenase 20 Kd
subunit proteins.


BL01150B 17.16 1.000e-
40 156-202 BL01150A
14.10 8.200e-39 100-
138 


986 


BL00795


Involucrin proteins. 


BL00795C 17.06 7.211e- 
14 4-49 BL00795C 
17.06 1.778e-ll 1-46 
BL00795C 17.06 3.407e-
10 14-59 BL00795C
17.06 7.802e-10 2-47
BL00795C 17.06 8.640e-
10 19-64 BL00795C
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e-
09 3-48 


987 


BL00939


Ribosomal protein L1e
proteins. 


BL00939F 17.27 S.393e- 
09 810-840 


988 


PR00452 


SH3 DOMAIN SIGNATURE


PR00452B 11.65 6.538e-
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-
11 497-513 


994 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 2.500e-
25 146-189 


997 


BL01304


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767


5 TRANSMITTER DOMAIN.


DM01767B 10.07 7.868e-
09 22-39 


1000 


PR00926


MITOCHONDRIAL CARRIER
PROTEIN SIGNATURE


PR00926C 16.07 1.750e-
24 73-94 PR00926D
10.53 3.250e-23 126-
145 PR00926F 17.75
6.211e-23 217-240
PR00926E 11.70 6.62Se-








20 174-193 PR00926C
16.07 2.12Se-18 24-35
PR00926A 10.41 1.000e-
15 11-25 PR00926F
17.75 S.565e-09 120-
143


1005 


BL00406


Actins proteins.


BL00406B 5.47 1.000e-
40 86-143 BL00406C
6.75 1.000e-40 147-202
BL00406D 12.58 3.700e-
40 270-325 BL00406E
8.44 7.375e-38 327-377
BL00406A 9.95 3.348e-
29 11-46 


1006 


BL00406


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A 
9.95 3.348e-29 11-46 


1007 


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e-
22 384-407 PR00304C
8.69 4.667e-20 98-118
PR00304B 11.60 7.577e-
19 68-87 PR00304A
9.20 3.382e-16 46-63 
PR00304E 7.79 6.870e- 
13 418-431 


1005 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL - 
BINDING NU. 


PD01066 19.43 2.929e- 
32 68-107 


1012 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168 


SYNTHETASE LIGASE
PROTEIN ALANYL.


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 1.391e-
32 261-302 PD00930A
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase
family phosphohistidine
proteins.


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025 


PR00305


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353


HMG1/2 proteins.


BL00353B 11.47 2.436e-
18 238-288 BL00353C
14.83 8.844e-11 288-
335 


1028 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL001B3 28.97 l.ilOe- 
33 43-91 


1033 


PF00580


UvrD/REP helicase.


PF00580A 13.37 4.720e- 
09 111-133 


1034 


PR00413


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY
SIGNATURE


PR00413E li>- to 3.429e- 
09 154-171 


1037 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.657e-
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 4 .259e- 
11 55-32 


1039 


BL00299 


Ubiquitin domain
proteins . 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP-
RIBOSYLTRANSFERASE


PR00970A 17.73 6.143e-
20 56-78 PR00970D










SIGNATURE


S.St 2.154e-18 154-171 








PROCjrVOr 12 30 1 . OOOe- 








16 224-241 PR00970G 








9.97 9.225r--T5 2.47-258 








f ROCS /0B lr.37 1 . 2 90e - 








13 £6-105 PR00970C 








11.06 1 . 643e - 1 1 115- 








130 PR00970E 11 .23 








Q Bor.o 1 i "ni 01D 

f.c^oe-iJi ^u<i-zjo 


1042


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 2.200e-10
< H 3 - ^ bfl


1043 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 6.786e-
13 134-128 PR00048A
XV. l.UUue-03 1 1 Z-
186


1045


BL00615


C-type lectin domain
proteins.


BL00615A 16.68 1.720e-
11 218-236 BL00615E
12.25 l.ob/e-10 Jl /-
333


1046 


BL01092 


Adenylate cyclases
class-I proteins.


BL01092N 13.54 8.924e-
10 3-40


1047


BL01216


ATP-citrate lyase /
succinyl-CoA ligases
family proteins.


BL01216D 21.75 4.316e-
28 334-344 BL01216A
13.91 1.000e-10 97-112


1049 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031B 15.41 7.610e-
12 102-136


1050 


BL01073 


Ribosomal protein L24e
proteins.


BL01073 24.30 1.000e-
40 12-62


1054 


BL00571


Amidases proteins.


BL00571 25.69 5.875e-
31 360-212


1055 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 5.235e-
11 98-117 BL00030B
7.03 4.316e-09 137-147


1058 


BL00223 


Annexins repeat proteins
domain proteins.


BL00223C 24.79 8.7S4e-
23 262-317 BL00223A
15.59 9.478e-'-4 46-80
BL00223A 15.59 5.557e-
11 138-152


1060 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 3.455e-
35 15e-201


1064 


BL00455 


Putative AMP-binding
domain proteins.


BL00455 13. .^1 6.213e-
13 280-296


1065 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 2.000e-
09 135- 129 PR00019E-
11.36 3.B8OC-09 87-101


1066 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.'/5 4.600e-
16 153-172 PR00326C
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e-
14 172-191 PR00326D

236


1071 


PD02870 


RECEPTOR INTERLEUKIN-1


PD02870B 18.84 B.blbe-
11 16-1-197


1072 


PF00856


SET domain proteins.


PF00856A 26.14 5.976e-
09 350-387


1075 


BL01009 


Extracellular proteins
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


BL01009D 14.19 4 . JUUe-
20 127-148 BL01009A
13.75 6.586e-13 57-75
BL01009E 13.50 1.439e-
11 159-175


1077 


PR00724 


CARBOXYPEPTIDASE C
SERINE PROTEASE (S10)
FAMILY SIGNATURE


PR00724A 10.91 1.000e-
08 366-379




1078 


BL00215 


Mitochondrial energy
transfer proteins.


BL00215A 15.82 1.000e-
12 170-195 BL00215A
15.82 7.529e-10 79-104


1079


BL00678


Trp-Asp (WD) repeat


BL00678 9.67 4.316e-09 






proteins proteins.


2S8-30£- 


10P: 


BL00326 


Tropomyosins proteins. 


BLO0326A 14 . 01 7.39£e- 
10 23-51 


2094 


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 3.204e-
18 57-92 BL00460B
9.73 6.400e-13 100-118
BL00460D 16.89 9.143e-
12 162-182 BL00460C
14.35 5.500e-09 133- 
156 


1Q91 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB
FIMBRIA TRAN.


PD02811A 20.67 3.017e-
22 67-105 PD02811B
17.07 2.263e-21 118-
151 PD02811C 13.25
5.696e-13 154-167 




PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB
FIMBRIA TRAN.


PD02811A 20.67 3.017e-
22 60-98 PD02811B
17.07 2.263e-21 111-
144 PD02811C 13.25
5.696e-13 147-160 


ICS" 


BL00479 


Phorbol esters / 
diacylglycerol binding
domain proteins. 


BL00479B 12.57 6.143e- 
09 20C-2U 


nee 


PF00881 


Nitroreductase family. 


PF00881A 27.15 9.229e-
13 111-147 


lies 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185-
208 PR00449D 10.79
8.364e-09 131-145 




PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11 .83 5.737e- 
20 42-60 PR00405A 
17 .71 2 .*?03e-17 23-43 
PR00405C 19.41 6.902e-
10 53-8t 


11 at 


BL00355


HMG14 and HMG17
proteins.


BL00355 5.97 2.528e-25
20-51 


ill'; 


BL00355 


HMG14 and HMG17
proteins . 


BL00355 5.97 2.528e-25 
20-51 


n?c 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107B 13.31 4.857e- 
10 290-30f 


112:: 


PR00412


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412F 18.76 9.526e- 
12 301-324 


112E 


PR00186


HEMERYTHRIN SIGNATURE


PR00186A 13.62 2.800e- 
09 87-101 


1125- 


BL00170 


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e-
33 84-129 BL00170B
20.97 6.838e-25 37-77
BL00170A 17.08 3.455e-
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins.


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1 .360e-14 59-80 


1132 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor
complexes medium chain
proteins.


BL00990C 18.78 4.176e-
38 235-269 BL00990A
21.44 4.316e-36 94-132
BL00990B 20.15 2.125e-
27 157-187 BL00990D
16.13 5.320e-18 403- 
422 


1137 


PR00314


CLATHRIN COAT ASSEMBLY
PROTEIN SIGNATURE


PR00314B 15.68 8.000e-
34 200-128 PR00314D
9.66 3.531e-33 233-261
PR00314C 16.05 8.909e-


32 159-188 PR00314A 
14.53 1.281e-22 13-34 


j ai3« 


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 6.36<ie-
13 13-57 


1143 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
19 451-482 BL00107B
13.31 3.077e-12 519-
535


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 23-42 


115S 


PD01652


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB. 


PD01652B 8.50 9.396e- 
10 522-574 PD01652E 
8.50 9.463e-10 740-752 


1157 


PD02894


HYDROLASE N4- PRECURSOR
PROTEIN SIGNAL BE.


PD02894A 21.96 7.873e-
26 81-127 PD02894B 
13.93 1.188e-27 178- 
211 


1159 


BL00623


GMC oxidoreductases
proteins.


BL00623E 15.00 3.531e- 
20 391-414 BL00623C 
10.86 4.240e-20 155- 
17C 


1161 


PD01937


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 
09 330-341 


1162 


PD01937


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.45Se-
10 214-239 PR00624D
11.94 1.961e-09 312-
337


1167 


BL00226 


Intermediate filaments
proteins . 


BL00226B 23.86 7.384e- 
09 302-350 


1177 


BL01032 


Protein phosphatase 2C
proteins . 


BL01032G 8.33 1.422e- 
10 34-48 


1178 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 1.794e-
10 205-220 PR00320C
13.01 7.840e-10 205-
220 PR00320B 12.19
8.457e-10 35-50
PR00320A 16.74 7.146e-
09 35-50 PR00320B
12.19 9.100e-09 79-94


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4 .150e- 
19 765-784 


1181 


BL00291


Prion protein.


BL00291A 4.49 8.962e-
11 152-187


1184 


BL00720


Guanine-nucleotide
dissociation stimulators
CDC25 family sign.


BL00720B 16.57 4.103e-
18 1089-1113


1185 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.553e-
13 204-229 BL00215A
15.82 1.429e-12 11-36
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983


Ly-6 / u-PAR domain
proteins.


BL00983C 12.69 2.761e-
10 77-93 


1188 


BL00878


Orn/DAP/Arg
decarboxylases family 2
pyridoxal-P attachment
si.


BL00878B 10.95 6.000e-
16 289-204 BL00878C
17.74 8.435e-15 225-
245 BL00878F 19.67
3.625e-13 379-402
BL00878D 16.56 1.621e-
09 270-289 


1191 


PD02939


PROTEIN GLUTATHIONE
SYNTHETASE SY.


PD02939B 10.10 2.723e-
12 203-220 PD02939C
20.01 1.000e-11 224-
252


1193 


PR00345


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e-
28 72-101 PR00345E


8.54 7.652e-28 149-174
PR00345C 4.54 9.100e-
28 101-125 PR00345D
10.97 1.964e-24 125-
149 PR00345A 13.46
5.645e-16 43-62




PR00345


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e-
28 108-137 PR00345E
8.54 7.652e-28 185-210
PR00345C 4.54 9.100e-
28 137-161 PR00345D
10.97 1.964e-24 161-
185 PR00345A 13.46
5.645e-16 79-98 


119E 


PF00995 


Sec1 family.


PF00995B 17.37 1 . 120e- 
13 224-264 


119* 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e- 
11 15-47 


Hi'.' 


BL01298


Dihydrodipicolinate
reductase proteins.


BL01298A 13.90 5.959e-
09 51-73 


120? 


BL00061 


Short-chain
dehydrogenases/reductase
s family proteins.


BL00061B 25.79 1.000e-
14 152-190 


12CX. 


PR00118 


BETA-LACTAMASE CLASS A
SIGNATURE 


PR00118F 16.42 9.386e- 
09 213-229 


120* 


BL01183


ubiE/COQ5
methyltransferase family
proteins.


BL01183E 21.31 1.429e-
37 164-229 BL01183D
27.71 8.535e-27 264-
307 BL01183A 13.25
3.250e-23 51-73
BL01183C 10.77 5.295e-
09 246-258


1201:


BL00979 


G-protein coupled
receptors family 3
proteins.


BL00979L 20.63 2.485e-09 105-346


120S- 


PF00023


Ank repeat proteins.


PF00023A 16.03 4.857e-11 49-65
PF00023B 14.20 1.818e-09 45-55


121* 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 227-241
PR00048A 10.52 4.316e-11 199-213


1213


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 1.720e-10 20-42
PR00450C 12.22 3.S06e-09 56-78
PR00450D 16.58 6.769e-09 44-64


1216 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 5.598e-10 179-230


1219 


PR00456


RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456E 3.06 5.348e-11 249-264


1222 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e-15 295-308
PD00066 13.92 7.231e-15 406-419
PD00066 13.92 2.286e-12 378-391
PD00066 13.92 7.857e-12 434-447
PD00066 13.92 3.348e-11 350-363


1223 


BL50058 


G-protein gamma subunit
profile.


BL50058 27.23 1.000e-40 13-61


I22fc 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 8.439e-09 279-330


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 13.82 1.000e-40 49-101
BL00437B 16.28 1.000e-40 114-168
BL00437C 21.86
BL00437D 25.72 1.000e-
1U trio J x l _
23.95 1.000e-40 327-379


X, £ 3 


ELUl i DL. 


/\1 neb I I J 1 AUIll C Jid 1 - 

repeat proteins. 


XjJLiV JL.JL \>\Jx> X ^ . C . ^ ;/ / tr - 

10 5-60 




DDO A*7 "5 C. 


FAMILY 8 SIGNATURE 


i J\ V V / J —>f\ — X. - X J 0.03/t, 

09 391-405 


1 9 ^ 9 


DD find Q"j 


NEUTROPHIL CYTOSOL

r» r«u x f\\Jr n x x» k,j i\J2zkjl> 
FACTOR P40 SIGNATURE


10 158-176 


1233 


PR00497


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e-10 158-176


123S 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 2.776e-09 75-321


1237 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 1.818e-21 36-79


1243 


PR00403


WW DOMAIN SIGNATURE


PR00403B 22.19 1.104e-11 10-25


1246 


PD01168


SYNTHETASE LIGASE
PROTEIN ALANYL.


PD01168L 9.47 2.837e-30 31-46
PD01168L 9.47 4.490e-10 174-189
PD01168L 9.47 7.612e-10 183-198


1249 


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 2.800e-10 183-196


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 2.440e-36 96-144


1255 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e-11 8-52


1256 


BL00373


Phosphoribosylglycinamide
formyltransferase
proteins.


BL00373C 10.35 3.348e-12 143-156


1256 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011B 13.08 3.217e-10 174-193


125S 1 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 6.286e-10 31-40


1261 


PR00070


DIHYDROFOLATE REDUCTASE
SIGNATURE


PR00070D 11.63 1.000e-15 112-327
PR00070C 13.09 9.500e-15 51-63
PR00070A 12.92 5.500e-32 16-27


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins.


BL00462A 20.89 6.438e-24 140-383
BL00462B 17.88 5.500e-20 230-267
BL00462C 27.41


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 9.4S5e- 
11 62-83 


1264 


BL01115


GTP-binding nuclear 
protein ran proteins 


BL01115A 10.22 5.670e- 

X x X. 1 - D X 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY
SIGNATURE


PR00837C 17.21 2.714e- 

Xa Xo^—±OZ rKUUOJ /« 

14 .77 4 .512e-12 86-105 
ppnnfl^in it 19 7 clip _ 

12 201-215 


1269 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449C 17.27 9.308e-22 40-63
PR00449E 13.50 1.000e-16 137-160
PR00449D 10.79 3.520e-11 102-116


1270 


BL00276


Channel forming colicins
proteins.


BL00276A 8.87 l.bOOe- 
09 17-29 


1275


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e-09 220-243


1276 


PR00412 


EPOXIDE HYDROLASE
SIGNATURE


PR00412B 12.59 7.894e-12 119-135
PR00412C 11.30 1.857e-11 ICS-179
PR00412A 13.23 3.400e-11 100-11?


127'/ 


PF00756


Putative esterase. 


PF00756C 14.12 9.538e- 
10 127-157 


1279


BL00134


Serine proteases,
trypsin family,
histidine proteins.


BL00134A 11.96 9.32be- 
13 128-145 


128C 


BL01220


Phosphatidylethanolamine
-binding protein family
proteins.


BL01220C 14.75 9.346e- 
15 248-276 


1285 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 2.286e-10 33-42


1287 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 7.182e-11 288-343


1292 


PR00802


SERUM ALBUMIN FAMILY
SIGNATURE


PR00802B 16.51 1.610e-10 81-105


1297 


PR00716


M-PHASE INDUCER
PHOSPHATASE SIGNATURE


PR00716C 17.65 5.696e-09 23-44


1298 


BL00478


LIM domain proteins.


BL00478B 34.79 6.478e-14 268-283


1301 


BL00127


Pancreatic ribonuclease
family proteins


BL00127C 31.49 3.571e-28 82-126
BL00127B 26.57 8.800e-28 23-66


1302 


PR00637


TYPE 3 BOMBESIN RECEPTOR
SIGNATURE


PR00637E 11.27 4.250e-09 290-306


1307 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 b.SOOe-17 13-38
BL00215A 15.82 1.000e-16 226-251
BL00215A 15.82 2.658e-13 107-132


130e 


PR00898


VASOPRESSIN V2 RECEPTOR 
SIGNATURE 


PR00898H 11.34 4.68?e- 
09 552-572 


1309 


PD00301


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301B 5.49 2.731e-09 390-401


1310 


BL00983


Ly-6 / u-PAR domain
proteins.


BL00983C 12.69 9.654e-13 73-89
BL00983B 8.19 3.132e-09 12-22


1313 


BL00194


Thioredoxin family
proteins.


BL00194 12.16 1.900e-11 15-28


1314 


BL00594


Aromatic amino acids
permeases proteins.


BL00594A 16.73 8.96Se- 
10 53-97 


1316 


BL00134


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.32Se- 
13 128-145 


1320 


BL00783


Ribosomal protein L13
proteins.


BL00783C 22.43 6.559e-24 07-117
BL00783A 14.55 1.600e-19 8-33
BL00783B 12.76 3.500e-12 74-86


1327 


PF00514 


Armadillo/beta-catenin-
like repeat proteins.


PF00514A 31.30 7.268e- 
11 82-120 






Eukaryotic RNA-binding


BL00030A 14.39 6.254e- 
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 7.239e-09 25-43


1332 


PR00161 


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE
CYTOCHROME SIGNATURE


PR00161C 9.51 4.930e-09 317-337


1333 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.76Se- 
33 10-49 


1336 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e- 
09 262-281 


1337 


PR00700 


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-05 211-230


1340 


PR00860


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860A 5.46 5.034e-13 5-18




1341 


BL00893


mutT domain proteins.


BL00893 18.29 6.7S0e-16 46-71


1343 


BL01282


BIR repeat proteins.


BL01282E 30.49 5.974e-21 383-422


1344 


DM00099 


4 kw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 6.313e-09 417-427




1345


BL00923


Aspartate and glutamate
racemases proteins


BL00923B 11.41 5.935e-10 335-146


1346 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 7.231e-13 44-57


1350 


PR00193 


MYOSIN HEAVY CHAIN
SIGNATURE


PR00193D 14.36 3.S72e-32 416-445
PR00193C 12.60 6.318e-31 179-207
PR00193B 11.69 3.971e-24 133-159
PR00193E 15.47 9.069e-22 470-499
PR00193A 15.41 1.783e-20 77-97


1352 


PR00447


NATURAL RESISTANCE-
ASSOCIATED MACROPHAGE
PROTEIN SIGNATURE


PR00447E 9.73 1.554e-15 299-319
PR00447D 13.54 3.408e-15 200-224
PR00447A 12.73 6.357e-11 97-124
PR00447G 6.69 9.877e-10 353-373


1353 


BL00303


S-100/ICaBP type calcium
binding protein


BL00303A 21.77 6.667e-26 45-82
BL00303B 26.15 1.000e-24 93-130


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.950e-29 375-421
BL00039A 18.44 7.136e-29 99-138
BL00039C 15.63 4.000e-18 225-249
BL00039B 19.19 3.182e-14 14:-167


1357 


PF00615


Regulator of G protein
signalling domain
proteins.


PF00615B 16.25 2.216e-12 84-101
PF00615C 10.06 8.412e-12 16^-176


1360 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.234e-29 10-49




1361 


PR00925


NONHISTONE CHROMOSOMAL
PROTEIN HMG17 FAMILY
SIGNATURE


PR00925A 5.47 5.091e-18 14-29
PR00925B 3.73 6.143e-14 29-42
PR00925C 5.57 4.789e-12 53-64
PR00925D 6.56 l.BS7e-10 76-87


1362 


BL01272 


Glucokinase regulatory
protein family proteins.


BL01272B 19.61 6.870e-30 136-171
BL01272C 11.68 3.314e-25 249-274
BL01272A 6.49 1.231e-18 99-117


1363 


BL01272 


Glucokinase regulatory
protein family proteins.


BL01272B 19.61 6.670e-30 113-148
BL01272C 11.68 3.314e-25 226-251
BL01272A 6.49 1.231e-18 76-94


1364 


DM00179 


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.304e-09 167-177


1368 


PR00169


POTASSIUM CHANNEL
SIGNATURE


PR00169A 16.77 1.522e-09 76-96


1370 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 1.794e-10 1-19


1371


BL00242


Integrins alpha chain
proteins.


BL00242B 6.23 8.615e-09 469-475=


1372


PR00625


DNAJ PROTEIN FAMILY
SIGNATURE


PR00625E 13.48 7.353e-19 46-67
PR00625A 12.84 1.3?1e-16 14-34


1373 


BL00434 


HSF-type DNA-binding
domain proteins.


BL00434C 23.95 3.770e-09 90-130


1374 


PR00962 


LETHAL (2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962C 8.00 6.337e-09 505-526


1375 


PD02475 


MUCIN EPITHELIAL TUMOR-
ASSOCIATE.


PD02475A 23.18 8.552e-10 1111-1150


1376 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 15.43 9.S73e- 
32 24-63 


1380 


BL00194 


Thioredoxin family
proteins.


BL00194 12.16 8.323e-12 48-61


1381 


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 0.60 1.456e-15 1123-1136


1383 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 7.600e-10 243-254


1384 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 7.600e-10 271-282


1385 


BL00303


S-100/ICaBP type calcium
binding protein.


BL00303B 26.15 6.203e-10 95-132


1386 


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.042e-09 1574-1628


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-11 52-61


1385 


PD01066 


tlDATCTM 7T\ifi C T W/TTi 

PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


30 10-49 


1390


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 3.512e-31 32-71


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e-10 127-137


1393 


PR00380 


KINESIN HEAVY CHAIN


PR00380A 14.18 9.625e-
Z--> DO iik rnLUJDUi/
9.93 2.406e-20 304-326
PR00380B 12.64 4.414e-16 208-226
PR00380C 13.18 6.536e-16 243-262


1394 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-14 462-475
PD00066 13.92 8.800e-14 348-361
PD00066 13.92 9.571e-12 405-418
PD00066 13.92 6.087e-11 490-503
PD00066 13.92 8.043e-11 320-333


1398


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.786e-32 10-49


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-29C 


1406 


PD00930 


PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930A 25.62 7.324e-15 363-385


1407 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.500e-10 457-476


1408 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.550e-11 179-193
PR00019A 11.19 8.826e-10 228-242
PR00019B 11.36 1.360e-09 199-213
PR00019B 11.36 4.960e-09 176-190


1A01- 


PR00510 


NEBULIN SIGNATURE


PR00510A 9.05 4.150e-12 182-202
PR00510B 12.56 8.767e-12 210-230
PR00510F 9.88 8.172e-10 S8-7S
PR00510D 9.21 2.367e-09 251-267


141C 


PD00078 


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.696e-09 31-44


1412 


BL00358


Ribosomal protein L5
proteins.


BL00358B 22.76 1.000e-40 57-103
BL00358C 13.75 6.087e-14 122-136
BL00358D 14.26 5.500e-13 143-158
BL00358A 13.06 1.931e-11 33-44


1414 


BL00282


Kazal serine protease
inhibitors family
proteins.


BL00282 16.86 7.338e-10 511-534


142 L 


BL00023 


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 4.300e-29 40-77


141'. 


PR00681 


RIBOSOMAL PROTEIN S1
SIGNATURE


PR00681G 12.54 2.149e-09 38-60


34H- 


DM00973


3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 1.462e-09 171-208


3 4 3.' 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 l.S71e- 
09 428-443 


142< 


PD01941


TRANSMEMBRANE
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-40 142-196
PD01941B 15.02 7.049e-30 400-447
PD01941E 15.92 2.475e-20 817-864
PD01941C 19.96 3.118e-19 488-543
PD01941D 27.18 9.614e-18 641-690
PD01941F 28.52 5.382e-15 1038-1093


1421 


PR00205


CADHERIN SIGNATURE


PR00205B 11.39 8.043e-12 199-217


14 21- 


PR00209 


ALPHA/ BETA GLIADIN 
FAMILY SIGNATURE 


PR00209B 4.88 6.318e- 
11 1009-1028 


1424 


BL50002 


Src homology 3 (SH3)
domain proteins profile.


BL50002A 14.19 8.200e-14 367-386
BL50002A 14.19 9.250e-12 298-317
BL50002A 14.19 4.462e-11 208-227
BL50002B 15.18 1.000e-09 244-256


142L 


PF00628


PHD-finger.


PF00628 15.84 3.045e-12 330-345


142t 


PF00628 


PHD-finger.


PF00628 15.84 3.045e-12 377-392


1427


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e-16 281-299
PR00405A 17.71 4.306e-14 262-282


1428 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.219e-34 147-193


1429


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 8.920e-10 577-592


1430 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e-12 295-314
PR00378B 13.80 8.650e-10 166-186


1431 


PR00928


GRAVES DISEASE CARRIER
PROTEIN SIGNATURE


PR00928B 13.53 3.769e-10 103-124




BL01113


C1q domain proteins.


BL01113B 18.26 7.049e-15 14-50
BL01113C 13.18 7.000e-12 82-102


1434 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 7.983e-10 135-150


1436 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 :.000e-12 84-103


343f> 


BL00290 


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290B 13.17 2.500e-09 250-268
BL00290A 20.89 4.000e-09 188-211


1440


PR00806


VINCULIN SIGNATURE


PR00806B 4.26 4.960e-09 38-52


1441 


PR00806


VINCULIN SIGNATURE


PR00806B 4.28 4.*60e-09 88-102


1444


BL00422


Granins proteins.


BL00422D 19.48 1.000e-08 114-138


1445


PD01841


PHOSPHORYLASE KINASE
ALPHA MUSCL.


PD01841A 21.71 1.000e-40 73-123
PD01841B 34.35 1.000e-40 144-185
PD01841D 17.87 1.000e-40 206-258
PD01841E 13.36 1.000e-40 296-345
PD01841G 24.26 1.000e-40 349-403
PD01841I 23.00 1.000e-40 494-536
PD01841J 14.94 1.000e-40 895-932
PD01841L 38.42 1.000e-40 1083-1125
PD01841E 38.60 9.719e-38 258-296
PD01841X 14.81 1.000e-35 1041-1071
PD01841H 21.30 3.189e-33 435-472
PD01841C 13.78 1.000e-25 185-206
PD01841M 10.82 3.250e-20 1175-1194


1446 


PF00816


H-NS histone family.


PF00816B 13.84 8.875e-09 390-220


1447


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 2.080e-09 402-416


1446 


DM00315


072 RIBONUCLEASE
INHIBITOR.


DM00315D 18.40 7.393e-09 23-67


1451 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030B 7.03 2.600e-10 94-104


1454


DM01688


2 POLY-IG RECEPTOR.


DM01688D 13.44 7.146e-09 382-405


1455 


PF00777


Sialyltransferase
family.


PF00777C 18.60 2.929e- 
22 4-5S 


1457


BL00927


Trehalase proteins.


BL00927C 10.83 8.085e-09 42-53


146C 


BL00545


Aldose 1-epimerase
proteins.


BL00545C 11.28 7.353e-17 169-182
BL00545A 10.20 2.071e-lS 73-89
BL00545B 13.10 3.942e-09 140-153


1466 


PR00097


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e- 
09 233-245 


1472 


BL01129


Hypothetical
yabO/yceC/sfhB family
proteins.


BL01129E 13.25 5.250e-22 170-195
BL01129C 25.56 9.526e-18 63-106


1473 


BL00790


Receptor tyrosine kinase
class V proteins.


BL00790I 20.01 2.821e-09 2114-2145


1475 


PF00686


Starch binding domain 
proteins. 


PF00686A 13.45 9.100e-09 267-277


Probable rabGAP domain
proteins.


PF00566A 12.64 7.33je-10 466-476




BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030B 7.03 9.400e-10 43-53




DM00406


GLIADIN.


DM00406 7.73 8.541e-10 292-305


1480 


BL00290 


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290B 13.17 2.385e-15 69-87
BL00290A 20.89 5.091e-11 12-35


KiSl 


PR00150 


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.03Se- 
09 21-51 


i4ei 


PF00780 


Domain found in NIK1-
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e- 
09 107-137 


1483


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 1.1S3e-09 108-162




PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 S.9D9e- 
25 17-56 


148C 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107B 13.31 1.529e-09 34-50


14 8£ 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 9.586e-10 116-162


1490 


BL00166


Enoyl-CoA
hydratase/isomerase
proteins.


BL00166D 22.87 2.607e-24 190-226
BL00166C 18.93 5.500e-14 140-167
BL00166B 16.92 9.357e-11 93-115


1 4 S'.v 


BL00452


Guanylate cyclases
proteins.


BL00452D 28.59 3.700e-31 63-106
BL00452E 11.92 3.045e-13 11b-131


14 ^2 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 3.667e-09 532-546


1497 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107B 13.31 1.000e-11 384-400
BL00107A 18.39 5.345e-11 322-353


1SOC 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


ibc: 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 4.789e-24 112-155


150? 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 4.789e-24 112-155


1505 


BL01177 


Anaphylatoxin domain
proteins.


BL01177E 20.64 5.800e-24 448-475
BL01177C 17.39 S.333e-19 402-421
BL01177B 13.61 7.840e-16 155-171
BL01177D 17.50 1.900e-15 427-445


1506 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 5.500e-14 311-336
BL00972A 11.93 7.429e-14 48-66
BL00972E 20.72 6.75Se-10 341-363


1512 


BL00523


Sulfatases proteins.


BL00523E 19.27 4.536e-22 76-106
BL00523D 9.89 1.563e-11 40-S2
BL00523F 10.85 4.162e-09 159-170
BL00523G 9.46 5.333e-09 256-266


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e-14 168-218


1516 


BL00600


Aminotransferases class-
III pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e-19 98-122
BL00600E 16.43 1.771e-17 302-331
BL00600G 12.41 9.625e-17 377-356
BL00600B IS.60 5.0S1e-15 160-186
BL00600C 16.18 6.040e-12 190-206
BL00600F 8.77 1.000e-11 343-356
BL00600D 8.71 1.000e-10 281-295


1523 


PD00930


PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930B 33.72 9.600e-18 41-82


152U 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 4.774e-11 192-207
PR00320B 12.19 8.839e-11 272-287
PR00320B 12.19 9.743e-10 106-121
PR00320A 16.74 1.878e-09 192-207
PR00320A 16.74 2.317e-09 106-121
PR00320A 16.74 8.683e-09 272-287
PR00320C 13.01 8.800e-09 106-121


1538 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 0.60 4.508e-15 171-184


1539


PF00781


Diacylglycerol kinase
catalytic domain
proteins (presumed).


PF00781D 11.11 7.593e-10 103-127


1540 


PR00965 


OCULAR ALBINISM TYPE 1
PROTEIN SIGNATURE


PR00965H 10.73 1.231e-29 312-334
PR00965E 12.93 5.846e-29 172-195
PR00965F 5.96 1.123e-28 209-231
PR00965C 15.04 1.000e-27 131-151
PR00965D 5.84 1.000e-27 150-170
PR00965G 8.52 2.440e-27 258-279
PR00965B 4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-25 35-55
PR00965I 3.91 6.442e-25 385-406


154 J 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.81 9.719e-17 163-207


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA.


PD02699C 24.84 1.000e-40 599-646
PD02699A 8.91 2.286e-34 219-248
PD02699B 18.28 6.143e-21 485-509


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e-10 182-197
PR00049D 0.00 7.102e-09 67-82


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins. 


BL00951C 19.35 l.OOOe- 
13.94 8.714e-40 142- 

1*1*7 DT k 1 t TO 

1 .OOOe-38 2-38 
BL00951B 14.23 6.250e- 
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins.


BL00536P 13.65 6.920e-30 279-318
BL00536D 22.91 5.737e-24 21-65
BL00536E 16.94 4.696e-18 248-279


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE


PR00139C 11.72 9.679e-09 550-569


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 S.11Se-09 58-73


1556


BL00061


Short-chain
dehydrogenases/reductases
family proteins


BL00061B 25.79 6.276e-13 67-105


1557 


BL01228 


Hypothetical cof family
proteins.


BL01228D 17.44 8.105e-12 107-132


1558 


BL01228


Hypothetical cof family
proteins.


BL01228D 17.44 8.105e-12 107-132


1559 


BL01228


Hypothetical cof family
proteins.


BL01228D 17.44 8.105e-12 107-132


1562 


BL00522 


DNA polymerase family X
proteins.


BL00522C 11.90 6.600e-18 412-436
BL00522P 27.30 1.738e-16 360-410
BL00522A 25.52 6.000e-16 279-32(
BL00522E 19.63 6.123e-14 502-532
BL00522F 14.90 2.385e-33 550-575


1563 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 1.94 Ve-11 46-59


1564 


BL00299


Ubiquitin domain
proteins.


BL00299 26.84 2.823e-10 324-376


156b 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.81 8.S94e-17 184-228
BL01013C 9.97 4.906e-12 14-24


1567 


BL00678


Trp-Asp (WD) repeat
proteins proteins


BL00678 9.67 3.400e-10 378-389
BL00678 9.67 5.800e-10 418-429
BL00678 9.67 8.800e-10 295-306


BL00479


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 5.235e-17 297-313
BL00479A 19.86 6.625e-15 271-294
BL00479A 19.86 2.667e-14 147-170
BL00479B 12.57 6.294e-12 173-189


157f 


PR00665 


OXYTOCIN RECEPTOR 
SIGNATURE 

• 


PR00665G 12.36 4.673e-24 364-384
PR00665D 9.93 1.200e-22 138-155
PR00665F 11.73 4.000e-22 337-354
PR00665C 5.89 1.000e-20 65-80
PR00665B 5.29 4.337e-19 24-39
PR00665F 5.60 2.929e-15 246-260
PR00665A 5.99 5.622e-15 11-25


1577 


DM00099 


4 kw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 9.308e- 
10 12 / - 1 3 f 


1579 




Somatomedin B domain
proteins.


&L>\)Ub AUK ^?.t>i> o. 7 / be- 

14 52-73 


1 58C 


Jr WZ 


HXDKOLASfc. N4 - PKfcLuRSOR 
PROTEIN SIGNAL BE. 


rDOZBJJic 13. 1*3 b.roye- 
16 182-215 PD02894A 

<C Jt . JO £ . i£3C 1U J I 1UJ j 


1581 


BL00411 


Kinesin motor domain 
proteins . 


BL00411C 15.04 5.292e-12 32-54
BL00411K 15.66 4.441e-11 245-276


1582 


PR00604 


CLASS IA AND IB
CYTOCHROME C SIGNATURE


PR00604A 11.13 2.440e-09 79-87


1584 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-10 225-238


1585


DM01551 


kw OSTEOINDUCTIVE YOPrt 
MEMBRANE OUTER • 


DM01551C 14.62 9.455e- 
11 125-145 


1586


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2. 


DM01354S 11.61 7.750e-09 474-495


1587


PR00072


MALIC ENZYME SIGNATURE


PR00072E 13.77 7.955k- 
33 180-230 PR00072.-. 
12.75 6.040e-25 320- 
145 PR00072C 11.42 
2.286e-24 216-23* 
PR00072D 10. 77 3 400e- 
22 276-295 PR00072F 
10.54 1 ,360e-19 301 ■ 
318 PR00072G 30.45 
5.304e-lS 433-450 
PR00072F 8.37 5.93 5e- 
15 332 -3 '.9 


1SBS 


BL00191


Cytochrome b5 family,
heme-binding domain
proteins.


BL00191K 15.64 1.537e-22 51-113
BL00191K 17.38 5.027e-12 398-442


1590 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 7.716e-13 213-224
DM01970B 8.60 2.157e-12 94-107


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1
CHROMOSOME.


DM00517B 10.96 6.625e-16 1175-1193
DM00517A 8.21 1.000e-11 1015-1026




BL00037 


Myb DNA-binding domain
proteins repeat proteins
proteins.


BL00037H 15.92 3.250e-27 116-142
BL00037A 16.68 2.500e-24 83-107
BL00037A 16.68 3.250e-12 31-55
BL00037D 15.92 3.526e-11 64-90
BL00037C 16.86 9.654e-10 146-164


159? 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 36.07 1.514e-09 110-127


1598 


PF00628


PHD-finger.


PF00628 15.84 3.2S0e-11 1667-1682


1599


PR00014


FIBRONECTIN TYPE III
REPEAT SIGNATURE


PR00014D 12.04 5.500e-09 980-9?5


1600


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 6.571e-10 30-35


1602 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 5.402e-10 136-167


1605 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.571e-10 44-57


1607 


BL00252


Interferon alpha, beta
and delta family
proteins.


BL00252A 18.49 6.65/e-23 20-57
BL00252B 19.78 9.325e-16 58-109


1610 


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 1.000e-08 61-94


1613 


BL00904 


Protein
prenyltransferases alpha
subunit repeat proteins
proteins.


BL00904C 8.98 7.353e-10 91-12S
BL00904D 1.47 6.030e-09 327-168


1612 


PF00168 


C2 domain proteins. 


PF00168C 27.49 3.250e- 
09 365-391 


1613 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 6.0S1e-09 932-953
BL00412D 16.54 7.3S3e-09 933-984


1614 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases 
proteins . 


BL00559I 13.63 3.531e-25 54-83
BL00559K 13.17 2.S57e-18 197-224
BL00559J 19.63 6.870e-16 124-176
BL00559L 13.60 9.000e-16 266-264


161S 


PD01427 


TRANSFERASE 
METHYLTRANSFERASE Bl . 


PD01427B 22.45 3.02be-22 500-541
PD01427A 19.94 8.773e-18 439-47.


1616 


BL00115 


Eukaryotic RNA
polymerase II
heptapeptide repeat
proteins.


BL00115Z 3.12 7.48Se-05 152-201
BL00115Z 3.12 9.603e-09 145-194


hl61\ 

i 
i 


BL00303


S-100/ICaBP type calcium
binding protein.


BL00303B 26.15 7.750e-32 51-88
BL00303A 21.77 1.947e-31 4-41


i6ie 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e-09 137-147


163S 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METKI. 


hr^oi ROOD ") CL "l A i nA/\« 

40 47-97 PDC1898C 
21 .56 7.000e-30 125- 
155 PD01800A 12.84 
8 . 500e-15 7-23 


1622


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 3.455e-09 692-704
PR00239E 1.58 4.580e-09 697-709
PR00239E 1.58 4.5S0e-09 702-714
PR00239E 1.58 5.193e-0S 703-715


I62i 


PR00860 


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 1.900e-18 27-41
PR00860C 9.61 1.474e-14 41-51
PR00860A 5.46 1.720e-14 5-18


1624 


PR00784


MITOCHONDRIAL BROWN FAT
UNCOUPLING PROTEIN
SIGNATURE


PR00784D 15.86 6.027e-11 77-95


1626 


BL00325 


Actin-depolymerizing
proteins.


BL00325B 21.66 1.000e-40 93-139
BL00325A 24.83 6.786e-23 61-93


163: 


BL00064 


L-lactate dehydrogenase
proteins.


BL00064B 23.57 1.000e-40 82-130
BL00064C 17.28 1.000e-40 137-182
BL00064E 27.20 1.000e-40 223-275
BL00064F 25.14 7.882e-36 286-331
BL00064A 21.16 1.000e-33 22-60
BL00064D 14.19 6.500e-3_ 182-212


1632 


PR00063 


RIBOSOMAL PROTEIN L27
SIGNATURE


PR00063B 25.24 9.700e-11 59-84
PR00063A


1634 


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239D 0.00 1.105e-11 36-49
PR00239C


1636 


BL01210


Caveolins proteins.


BL01210B 13.92 9.531e-30 133-183


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 S.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 8.144e-12 132-177


1640 


PR00015 


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR
SIGNATURE


PR00015B 9.04 8.468e-10 128-149


1641 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 5.935e-11 364-379
PR00320A 16.74 7.828e-11 364-379
PR00320C 13.01 2.800e-10 279-294
PR00320C 13.01 2.800e-10 364-379
PR00320B 12.19 5.114e-10 279-294
PR00320A 16.74 1.€59e-09 279-294
PR00320A 16.74 2.098e-09 229-244


1642 


PF00023


Ank repeat proteins.


PF00023A 16.03 6.464e-09 114-130


1643 


PR00169


POTASSIUM CHANNEL
SIGNATURE


PR00169A 16.77 •.806e-11 74-94


1644 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 2.200e-10 109-120
BL00678 9.67 5.737e-09 526-539


1645 


BL01108 


Ribosomal protein L24
proteins.


BL01108A 20.33 7.366e-17 56-89


1646


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 9.270e-21 103-12S
PR00380D 9.93 6.300e-18 386-408
PR00380C 13.18 7.923e-16 332-351
PR00380B

310


1647


DM01242


J J iiHtONa Nt- - INNA 
LIGASE. 


t~Vk/l niTjIOf IT 1 C Q T a 1 .» 

DnVltiQ^L. X I . J r> S>./yie- 
37 340-381 DM01242E 

505 DM01242D 23 .29 
3 .925e-30 42C-463 
DM01242B 23.57 H.l)54e- 

10.61 7.618e-14 526- 
54 0 


1G4D 


PD00126 


PROTEIN REPEAT DOMAIN 

TDD UTTIfM C TV 


PD00126A 22.53 S.500e- 
10 13-34 


165; 


BL01160


Kinesin light chain
repeat proteins. 


BL0116OB 19.54 t.720e- 

11 ill ^ DC 


1C52 


BL00933 


FGGY family of
carbohydrate kinases
proteins.


BL00933A 17.50 4.673e- 
12 11-35 BL00933E
ij.ou y . <ti /c - u j SDo- 
472 


165 2 




Involucrin proteins. 


10 70-115 


16 54 




DdL i ei / ui type p/jy_utfjic 

dehydrogenase proteins.


17 302-334 


1655 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 7.750e-
17 282-314 


loot 




Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


lb b U / - b J U 


1657 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 7.938e-
11 114-136 


1658


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 


1659 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 4.140e- 
12 376-401 BL00972E 
20.72 5.629e-09 446- 
468 


1660 


BL00406


Actins proteins.


BL00406D 12.56 6.767e-








15 188-243 


1661 


PR00105 


CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
1.000e-10 1305-1319


1662 


BL00280


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 3.172e-
33 3119-3163 


1663 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11-64 6.62Se~ 
23 107-125 PR00319C
13.41 5.714e-20 89-105
PR00319A 15.27 5.286e-
19 51-68 PR00319B
11.47 8.200e-19 70-85


1664 


BL00018


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 S.OSOe-10 ; 
489>5Qi 


1667 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 8.500e- 
38 7-46 


166S 


BL01153


NOL1/NOP2/sun family
proteins.


BL01153D 19.69 1.188e-
17 11S-141 BL011S3C 
13 .67 8.S77e-15 66-80 
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 3.100e-
10 1146-1169 


1672 


BL00598


Chromo domain proteins.


BL00598 14.45 8.500e-
20 27-49 


1673 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e- 
09 686-707 


1674 


PR00049


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e-
11 343-358 PR00049D
0.00 1.286e-10 342-357


167f- 


PR00747 


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e- 
io 4?7-44fl PR0D"747G 
14.50 2.286e-18 368- 
393 PR00747C 12.06 
7.500e-18 112-131
PR00747A 14.05 4.600e- 
17 42-63 PR00747D 
15.23 8.759e-17 163- 
183 PR00747E 15.13 
8.244e-15 254-272 
PR00747B 7.65 5.355e- 

1 j 1 D - j U rK U VI f % / r 

13.56 8.714e-10 311- 
32t 


1677 




FAMILY 47 SIGNATURE 


ppnrtl A7U 15 "7<i ft - 

rKUU/'J / Jl X £. - 1 © R . O JOC 

19 309-330 PR00747G 
14.50 2.286e-18 250- 
275 PR00747C 12.06
7.500e-18 112-131
PR00747A 14.05 4.600e- 
17 42-63 PR00747B 
7.65 5.355e-13 75-90
PR00747F 13.56 8.714e- 
10 193-210 


1680 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67
6.684e-09 320-331


1681 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins 


BL00678 9.67 4.600e-10
329-340 BL00678 9.67
6.684e-09 243-254


1683


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646 


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e- 
09 755-771 


1690 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 6.644e-
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 
PR00456E 3.06 8.125e-
10 420-435 


169-, 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e-
10 487-502 PR00456E
3.06 7.281e-10 488-503 
PR00456E 3.06 8.125e- 
10 489-504 


165? 


BL00674 


AAA-protein family
proteins.


BL00674C 22.60 8.043e- 
24 274-317 BL00674B 
4.46 4.000e-23 243-263 
BL00674D 23.41 8.560e-
18 338-385 BL00674E 
15.24 1.720e-15 414- 
434 


1697 


PR00409


PHTHALATE DIOXYGENASE
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e-
10 427-447 




PR00466 


CYTOCHROME B-245 HEAVY
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466E 
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e- 
09 498-517 


1699 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e- 
12 263-300 BL00028
16.07 3.765e-11 255-
272 BL00028 16.07 
S.154e-ll 171-188 
BL00028 16.07 5.500e-
11 227-244 BL00028
16.07 1.600e-10 199-
216 


1700 


BL01019


ADP-ribosylation factors
family proteins.


BL01019A 13.20 3.348e-
15 62-102 BL01019E 
19.49 4.000e-15 107- 
162 


1703 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.484e- 
12 200-239 


1707


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e-
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.565e- 
10 116-130 PR00019B 
11.36 4.600e-09 113-
127 PR00019B 11.36
7.120e-09 204-218 


1711 


BL01159


WW/rsp5/WWP domain
proteins.


BL01159 13.85 6.523e- 
11 232-247 BL01159 
13.85 5.408e-10 613-
628 


1712 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e- 
10 187-203 


1713 


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e-
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.550e-
11 230-241 


1715 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353


HMG1/2 proteins. 


BL00353C 14.83 6.018e-
10 136-183 BL00353B
11.47 8.866e-09 86-136


1719 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 8.448e- 
12 79-100 BL00038A 
13.61 4.000e-11 52-68


1723 


PD00567


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567C 9.17 8.500e-
09 418-428 


1724 


BL01279


Protein-L-
isoaspartate (D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e-
12 233-281 


1728 


BL00018


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730 


BL00594


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e-
09 17-61 


1731 


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 9.676e- 
10 296-350


1732


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 9.676e-
10 316-370


1733


PF00850


Histone deacetylase
family.


PF00850F 15.70 4.349e-
22 246-279 PF00850D 
14.76 6.6SCe-20 177- 
201 PF00850E 8.86 
8.691e-18 209-235 
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354 


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179


w KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 5.263e-
10 492-502 


1743 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 
10.79 2.241e-10 109-
123 PR00449E 13.50 
9.289e-10 144-167 


1744


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.186e-
11 5-27 PR00449D
10.79 2.241e-10 109-
123 PR00449E 13.50
9.289e-10 144-167 


1745 


BL00720 


Guanine-nucleotide
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e-
15 136-160 


1746 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e-
11 45-57 PR00081E 
17.54 3.935e-10 150-
168 


1747 


BL00439 


Acyltransferases
ChoActase/COT/CPT
family proteins. 


BL00439H 18.24 B.43Se- 
14 65-91 BL00439G 
13.40 2.695e-12 3-14


1749 


PR00819


CBXX/CFQX SUPERFAMILY
SIGNATURE 


PR00819B 10.83 7.158e- 
11 4-20 


1751 


PD00066 


PROTEIN ZINC-FINGER
METAL- BIND! . 


PD00066 13.92 3.400e-
14 33-46 PD00066
13.92 1.000e-13 89-102
PD00066 13.92 7.000e-
13 61-74 PD00066
13.92 6.571e-12 117-
130


1753 


BL01013


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 6.516e- 
18 33-77 


1754 


BL00790


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e- 
09 490-521 BL00790I
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e- 
09 287-316 


1756 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.750e-
35 10-49 


1758 


DM00406


GLIADIN. 


DM00406 7.73 7.600e-05 
653-666


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR 1 . 


PD02929A 28.27 4.529e- 
09 224-2 JE 


1765 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins.


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942


glpT family of
transporters proteins. 


BL00942F 15.07 4.343e- 
10 371-389 BL00942B 
20.36 8.040e-09 94-137 


1777 


DMOC2 3 5 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312 
1778


BL00084


Copper type II,
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e- 
20 169-224 BL00084E 
24.26 8.134e-16 10-56 
BL00084C 27.71 8.412e- 
11 107-158


1779 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.891e-15 344- 
380 BL01013C 9.97
6.308e-13 435-44*
BL01013B 11.33 3.717e-
12 409-420 


1783 


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
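
As an illustration of the results format described in the footnote above, the following is a minimal sketch, not part of the original disclosure, of how a RESULTS cell could be split into its fields. The names `SignatureHit`, `HIT_PATTERN` and `parse_results` are hypothetical, and the sketch assumes the cell text has already been rejoined into a single line (for example, `4.140e-` followed by `12` joined as `4.140e-12`).

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One entry of a RESULTS cell (hypothetical container, for illustration)."""
    subtype: str      # accession number subtype, e.g. "BL00972D"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature in the amino acid sequence


# Assumed entry layout: two letters, five digits, optional subtype letter,
# then raw score, p-value in exponent notation, and a start-end position.
HIT_PATTERN = re.compile(
    r"(?P<subtype>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+\.\d+)\s+"
    r"(?P<p>\d+\.\d+e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Return every hit found in a rejoined RESULTS cell, in order of appearance."""
    return [
        SignatureHit(
            subtype=m.group("subtype"),
            raw_score=float(m.group("score")),
            p_value=float(m.group("p")),
            start=int(m.group("start")),
            end=int(m.group("end")),
        )
        for m in HIT_PATTERN.finditer(cell)
    ]


hits = parse_results(
    "BL00972D 22.55 4.140e-12 376-401 BL00972E 20.72 5.629e-09 446-468"
)
```

The example cell is the entry listed for SEQ ID NO: 1659; entries whose fields do not match the assumed layout are simply skipped rather than guessed at.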
TABLE 4



SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


2


ig


Immunoglobulin domain


2.1e-22


109.5


\ 

l 


pkinase


Eukaryotic protein kinase
domain


1.3e-2Sr 


110.7 


4 


zf-C2H2


Zinc finger, C2H2 type


1.6e-21


84.9 


5 


fn3 


Fibronectin type III domain 


0


1097.1 




fn3 


Fibronectin type III domain


0 


1035.0 


7 


fn3


Fibronectin type III domain 


0 


1090.4


~T 


fn3 


Fibronectin type III domain 


0 


1097.1 


_ 


TBC


TBC domain


4e-40 


146.7 


10 


p450 


Cytochrome P450


9.5e-l': 


62.0 




ank


Ank repeat 


6e-2C 


79.7 


14 


ig


Immunoglobulin domain 


1.7e-05


22.7


If) 


zf-MYND


MYND finger


1.3e-06


35.4 


16 


zf-MYND


MYND finger


1.3e-06


35.4 


17 


zf-C2H2


Zinc finger, C2H2 type


1.7e-9i- 


343.9


1 6 


CAP_GLY


CAP-Gly domain


1 . 2e-2£: 


98.7


20 


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus




410.5 


2 j_ 


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus


4.3e-102


352.6 


22 


u a. I iia^r 


domain 


2 . A e- 7S* 


277.0


£ J 




doma j r. 


8 . 4 e- 74 


258 ,€ 


25 




♦ < 1» i~\ L/\^ _i y MIL 1 a CI >. W I J C* OUUUI11 L 


o 


1077 .7 






C 1 doma l J; 


1 . 9e- 1 C 


44 .4 


•> -» 


3 




7 6e-3i 


111.2 


2 b 


Ribosomal_L2
3


Ribosomal protein L23


1e-29


104.2


30 


zf-A20


A20-like zinc finger


1.5e-10


48.5


31 


zf-A20


A20-like zinc finger


1.5e-10


48.5 


32 


FMN_dh


FMN-dependent dehydrogenase


5.4e-179 


608. a 


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID) 


3 59 


209.9 


35


ig


Immunoglobulin domain


1.4e-13


48.8


36 


ig


Immunoglobulin domain


1.4e-13


48.8


40


kinesin


Kinesin motor domain


6.7e-76


265.6 


44 


Ets


Ets-domain


1.4e-56


182.1 


45 


Ets 


Ets-domain


1.4e-56


182.1 


4 6~~~ 


LRR


Leucine Rich Repeat


1.7e-13


58.3 


46 


zf-C2H2


Zinc finger, C2H2 type


2.3e-162


552.8 


45 


ITAM


Immunoreceptor tyrosine-based
activation mot


1.4e-05 


31.9 


50


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


&: 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras 


Ras family 


8.Se-45 


162.3 


53 


PRK 


Phosphoribulokinase


2.1e-65 


230.7 


54 


myb_DNA- 
binding 


Myb-like DNA-binding domain 


0.096 


15.2 


55


voltage_CLC 


Voltage gated chloride channels 


3.3e-186


631.9 


56 


sugar_tr


Sugar (and other) transporter 


0.0001b 


-64.3 


57 


TBC 


TBC domain 


2.2e-37


137.6 


58


ank 


Ank repeat 


5.9e-25 


96.3 


59 


ank 


Ank repeat 


5.9e-25 


96.3 


prr 
i 


PMP22_Claudi
n


PMP-22/EMP/MP20/Claudin family


7.9e-49 


175.6 


I 66 


C2 


C2 domain 


7.9e-54 


192.2 


6S 


C2 


C2 domain 


2.3e-54 


194.0


70 


Kelch 


Kelch motif 


9.4e-99 


341.5 


1 « 


ig


Immunoglobulin domain


8.2e-28


94.7 


73


pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1

7<? | pxinese 


Eukaryotic protein kinase
domain


2 .Be-3S 


140.6 


7* 


zf-
C4_Topoisom


Topoisomerase DNA binding C4
zinc fine 


5.4t-S4 | 192.8 


83 


Peptidase_S9


Prolyl oligopeptidase family


4 . 3e-2C 


36 .8 


8^ 


fn3


Fibronectin type III domain 


4.1e-51


183.2


ee 


SH2 


Src homology domain 2 


3.1e-22


67.7 


ig


Immunoglobulin domain


0.0091


14.0


09 


WD40


WD domain, G-beta repeat 


2 .le-2i 


84 .6 j 


92 


laminin_G


Laminin G domain


6.1e-27


98.5


9Ti 


AMP-binding


AMP-binding enzyme 


2 .4e- :3 


-37.2 


95 


pkinase


Eukaryotic protein kinase 
domain 


1.4e-59


211.4 


9t 


pkinase


Eukaryotic protein kinase 
domain 


2 .fee -Si 


183 .9 


97 


adh_short 


short chain dehydrogenase 


2e-63 


217.5 


-9 p 


>: 1 neoin 


Kinesin motor domain 


2.2e-80 


300.4 


103 




PTB domain (IRS-1 type)


b.4e-3fe 


133 .0 


2 02 


AAA 


ATPases associated with various 
cellular act 


6 . 8e-05 


-5.2 


104 


pkinase


Eukaryotic protein kinase
domain


2.7e-73


256.9 


106 


ras


Ras family 


8 . 3e-24 


92 .b 


107 


FYVE


FYVE zinc finger 


5.4e-27


100.7 


106 


Cyt_reductas
e


FAD/NAD-binding Cytochrome
reductase


7.7e-61


215.5 


10? 


zf-C2H2


Zinc finger, C2H2 type


2.3e-122


420.0 


113 


pkinase


Eukaryotic protein kinase 
domain 


4e- 8fc 


306.2 


lie 


PH


PH domain 


3.1e-13 


45.2 


117 


lipocalin


Lipocalin / cytosolic fatty
acid binding pr


2.4e-14


53.5


lie 


pkinase


Eukaryotic protein kinase 
domain 


4 . Se-20 


76.3 


120 


WD40


WD domain, G-beta repeat 


2.4e-14 


61.1


121 


WD40


WD domain, G-beta repeat 


2.4e-14 


61.1 


123 


lF5_eIF4_eIF 

7 


eIF4-gamma/eIF5/eIF2-epsilon


le-3? 


122.2 


124 


ig


Immunoglobulin domain


6 .Se-Of 


30.6 


12 7 


mito_carr 


Mitochondrial carrier proteins 


3e- 16 


58 .6 


12 e 


PP2C


Protein phosphatase 2C 


2 .2e-71 


250.6 


12 s 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


3.1e-20


80.6 


130


pfkb 


pfkB family carbohydrate kinase


4 .Se-42 


137.1


133 


ACBP 


Acyl CoA binding protein 


4 .6e-22 


86.7 


134 


rrm


RNA recognition motif.


1.2e-3i 


118.5 


135 


IQ


IQ calmodulin-binding motif


2.6e-08


41.0


136 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


9.3e-22 


85.7 


139 


WH2


Wiskott Aldrich syndrome
homology region 2


0.006*; 


23.1 


140 


zf-C2H2 


Zinc finger, C2H2 type 


1 .7e-82 


287 .5 


141 


Peptidase_S2
6


Signal peptidase I


5.7e-lC 


3 5\ 7 


14? 


arf


ADP-ribosylation factor family 


1.2e-39 


145.2 


14€ 


KRAB


KRAB box 


7.3e-30


112.6


l4fc 


DUF6


Integral membrane protein DUF6 


0.096 


8.0 


14S 


PDEase 


3'5' -cyclic nucleotide 
phosphodiesterase 


3 . Se-80 


231.1 


15: 


S4 


S4 domain 


1.1e-06


42.3 


15:- 


tRNA-synt_1d


tRNA synthetases class I (R)


3.8e-103


356.1


154 


Cyt_reductas
e


FAD/NAD-binding Cytochrome
reductase


7.8e-60


212.2 


15E 


ras 


Ras family 


3.6e-26 


107.0 


157 


actin


Actin


3.8e-26 


87.1 
156

Jacalin 


Jacalin-like lectin domain


C . 0 9 


-24 . 9 


16C 


Zn_carbOpept 


Zinc carboxypeptidase




471 . 9 


1 ^! 


pkinase


Eukaryotic protein kinase
domain


5.1e-67


236.1


It' 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


5.3e-07


27 .0 


16f: 


Ribosomal_S1
5 


Ribosomal protein S15 


1. ie-06 


29.0


169 


DEAD 


DEAD/DEAH box helicase


1e-48


197.0


171 


DUF59 


Domain of unknown function
DUF59


0.C7 


-17.4 


17: 


pkinase 


Eukaryotic protein kinase 
domain


3.7e-lb 


b8 .6 


373 


globin 


Globin 


4.6e-18 


67.4 


174 


WW 


WW domain 


7.3e-06 


32 . 9 


179 


ras 


Ras family 


le-31 


118.8 


17£ 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


2.£e-l7 


71 . 0 


179 


zf-C2H2


Zinc finger, C2H2 type 


1 .be-99 


344 . 2 


18C 


C1q


C1q domain


8.8e-72


251.9


19C 
191 


Y_phosphatas
e


Protein-tyrosine phosphatase


4 . 9e-28 7 


967 . 0 


efhand 


EF hand


7. be- 16 


66.1 


193 


pkinase


Eukaryotic protein kinase
domain 


G . be- 82 


28b . 6 


1 94 


bromodomain


Bromodomain


S fte- 31 


111.4 


199 


PALP 


Pyridoxal-phosphate dependent
enzyme


2.be-64 


227.1 


197 


DnaJ 


DnaJ domain 


1 .Ge-38 


141 . 4 


19 9 


R rn&AD 


dimethylases 


n nnm ft 


l ft s 
± o . y 


2 00 


t 


nioL jujuc oiiiu lj i il jj-s^j i id i. o ^ 


2 be - 1 0 


3 7.x 


201 


WH2 


Wiskott Aldrich syndrome
homology region 2


0.00048


26.9 


2 04 


vATP- 
synt_AC39


ATP synthase (C/AC39) subunit


1.3e-159


543 . 7 


209 


vATP-
synt_AC39 


ATP synthase (C/AC39) subunit 


1 .6e-139 


476 .9 


20C 


ldl_recept_a


Low-density lipoprotein
receptor domain 


2 .4e-25 


97 .6 


209 


ank 


Ank repeat 


1 .4e-19 


78.4 


21C 


Rhomboid 


Rhomboid family


0. 0039 


1.2 


211 


C1q


C1q domain


1 .6e-70 


247.7 


212 


UQ_con


Ubiquitin-conjugating enzyme


7.4e-74


258.6 


213 


UQ_con 


Ubiquitin-conjugating enzyme


1e-53


191 .9 


21b 


DEAD 


DEAD/DEAH box helicase 


1.8e-43


140.4 


216 


PMP22_Claudi
n


PMP-22/EMP/MP20/Claudin family


4 .be-21 


83.4 


2ie 


Glycos_trans
f_2 


Glycosyl transferases 


4e-21 


83 .6 


219 


ig 


Immunoglobulin domain 


0.092 


10.7 


222 


WD40


WD domain, G-beta repeat 


7 . 4e-23 


89.4 


22a 


TPR 


TPR Domain 


1.2e-06 


42.1 


22b 


DnaJ_CXXCXGX
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5


226 


DnaJ_CXXCXGX
G


DnaJ central domain (4 repeats)


1.5e-38


141.5


229 


HSP70 


Hsp70 protein


2.4e-54 


194.0 


230 


GSHPx 


Glutathione peroxidases 


3.4e-47 


170.2 


231 


tsp_1


Thrombospondin type 1 domain


0.007S 


17.1 


233 


cyclin 


Cyclin 


4 .6e-144 


492.0


234 


ras 


Ras family 


4 .8e-50 


179.7 


235 


LRR 


Leucine Rich Repeat 


1 .2e-30 


115.3 


236 


LRR 


Leucine Rich Repeat 


6.7e-29 


109.4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.7e-09 


45.0




dCMP_cyt_dea
m


Cytidine and deoxycytidylate
deaminase


2 . E>e- 05 


31 . 3 




ig


Immunoglobulin domain


6 .7c- 08 


30. 1 


240 


wnt 


wnt family of developmental 
signaling protei 


97Te-270 


742 .6 


2S0 


mito_carr 


Mitochondrial carrier proteins


1.3e-55


193 . 6 


?S4 


adenylatekin
ase 


Adenylate kinase 


1 . 8e 14 


55.7


25^ 


Cation_efflu
x


Cation efflux family 


2 . 8e-33 


124 . G 


256 


SH3 


SH3 domain


3 . 9e- 14 


60. 4 


257 


Aa_trans 


Transmembrane amino acid
transporter protein 


2.6e-S2 


187.2


256 


adenylatekin
ase 


Adenylate kinase 


2.1e-110


380.2


2 b ^ 


HIT 


HIT family 


e.2e-C7 


25.3 


260 


Bacterial_PQ
Q


PQQ enzyme repeat


i . 6e-15 


65.0 


26? 


proteasome


Proteasome A-type and B-type


6.5e-64


225.'. 


267 


pkinase 


Eukaryotic protein kinase 
domain 


& . 3e-27 


101 . c 


27C 


filament 


Intermediate filament proteins 


3 . 2e-150 


512 . £ 


2 V JL 


Choline_kina
se


Choline/ethanolamine kinase


2e-67 


237.4 


277 


Ribosomal_S7


Ribosomal protein S7p/S5e


3 . 3e-20 


80.6 


27S- 


pkinase 


Eukaryotic protein kinase 
domain 


j . 3e-77 


269. S 


28C 


WD40


WD domain, G-beta repeat 


1 . Be-73 


255 .4 


281 


WD40


WD domain, G-beta repeat 


1 . 8e-73 


255.4 


2 84 


zf-DHHC 


DHHC zinc finger domain 


4 . 6e-24 


93.4 


28 7 


Exonuclease 


Exonuclease


1 . 4e-67 


238 .0 


293 


SAM 


SAM domain (Sterile alpha
motif)


0 . 034 


11.2 


292 


SAM 


SAM domain (Sterile alpha
motif)


0 . 034 


11 . 2 


294 


zf-C2H2 


Zinc finger, C2H2 type


:.4e-29 


111 .7 


295 


zf-C2H2 


Zinc finger, C2H2 type 


2 . 2e-125 


430.0 


296 


mito_carr 


Mitochondrial carrier proteins 


4 . le-59 


205. 5 


29 7 


HMG_box


HMG (high mobility group) box 


e . 7e-29 


109.4 


302 


Glycoe trans 1 Glycosyl transferase 
*-« ~ 1 


c : e - 8 7 


302 . 5 


304 


tRNA-synt_2


tRNA synthetases class II (D, K
and N)


j . le-84 


294 .8 


305 


KRAB


KRAB box


2e-44 


161 .0 


306 


rrm


RNA recognition motif.


2 . 7e-44 


160 .6 


3Cfr 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


S.2e-39 


126 .1 


309 


DNA_polymera 
seX 


DNA polymerase X family 


2 . 4e-64 


227 .2 


31 j 


F-box 


F-box domain. 


9 . 5e-08 


39 . 2 


312 


ig 


Immunoglobulin domain


. 8e-19 


65.9 


313 


Ets 


Ets-domain 


e.le-60 


192 . 3 


31b 


Kelch 


Kelch motif


1.3e-106


367 . 6 


217 


arf 


ADP-ribosylation factor family


3.2e-3S 


130.4 


316 


sugar_tr 


Sugar (and other) transporter 


0. 0003 


-73 .1 


320 


pkinase 


Eukaryotic protein kinase 
doma i n 


8. le-83 


288 . 6 


322 


pkinase 


Eukaryotic protein kinase 
domain 


4. 9e-81 


2 82.6 


324 


Xlink 


Extracellular link domain 


4 . Se-143 


331.5 


326 


ARID 


ARID DNA binding domain 


S.le-37 


136.4 


327 


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


328 


cadherin 


Cadherin domain


6.ie-81 


281. S 


331 


chromo


'chromo' (CHRromatin
Organization MOdifier)


4e-18 


66.7 


331- 


Peptidase_M2
2


Glycoprotease family


1.2e-136


467.4 


33S 


vwa 


von Willebrand factor type A
domain 


2 . 3e-07 


37. c" 1 

I 


339


ras


Ras family


7 .8e-07 


-59.1 : 


340


zf-C2H2


Zinc finger, C2H2 type


8 .2e-64 


225 .< 


342


zf-C2H2


Zinc finger, C2H2 type 


2 .4e-85 


2 97 .C 


343 




Immunoglobulin domain 


0 . 0005 


18.0 


346 


pkinase


Eukaryotic protein kinase 
domain 


6 .5e-65 


229.1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


22 9 


351 


EGF 


EGF-like domain


8.5e-20 


79. i 


352 


ank 


Ank repeat 


2 .5e-101 


350.0 


354 


TBC 


TBC domain 


S.le-lS 


63.2 i 


355 


PHD 


PHD-finger


3.2e-07 


37 . 4 


358 


DUF6 


Integral membrane protein DUF6


0.033


15 . £ 


359 


zf-C2H2


Zinc finger, C2H2 type


7.4e-20


79.4 


361 


ank 


Ank repeat


6.6e-34


126 . 1 


362 


ArfGap


Putative GTP-ase activating
protein for Arf 


4 . 7e - 53 


189.7 


363 


efhand


EF hand


5 4e- 10 


46 . C- 


367 


LRR 


Leucine Rich Repeat


8 8e - 44 


158 .5 


368 


laminin_G


Laminin G domain 


1 . 5e-33 


121 .7 


369 


PP2C


Protein phosphatase 2C 


S . 3 e - 20 


73 . t 

• 


3 72 


LIM


LIM domain containing proteins


9 .9e-lS 


57. 1 


373 


KRAB 


KRAB box


A ft o _ O "i 


90. C . 


3 76 


i on t * ci i ) 5 


Ion transport protein 


2.9e-09 


-4 . V 


Oil 




Beige/BEACH domain 


** . i?e iuo 


704 . L 


3 80 


— 


Eukaryotic protein kinase




327 . !: 


3 61 


AMP-binding


AMP-binding enzyme


1 . 4e-07 


-14 0.3 


3 82 


HECT 


HECT-domain (ubiquitin-
transferase).


1 . 3e-07 


-13 .5 


384 


ank 


Ank repeat 


2.5e-101 


350 .0 


386 


ig


Immunoglobulin domain 


9 ,5e-05 


23 . 1 


388 


zf -C2H2 


Zinc finger, C2H2 type 


1.7e-42 


154 .t 


389 


ig 


Immunoglobulin domain 


2 . Se-1 5 


■5 4 ;• 


390 


mito_carr


Mitochondrial carrier proteins 


3.5e-67 


233 .2 


392 


TPR 


TPR Domain 


6.3e-17 


69. ', 


393 


SH3 


SH3 domain 


3.5e-09 


43 . 9 


394 


AAA 


ATPases associated with various 
cellular act 


4 .le-21 


83. € 


396 


spectrin


Spectrin repeat 


2.1e-67 


231 .3 


397 


zf -C2H2 


Zinc finger, C2H2 type 


0.0066 


23 . "j \ 


399 


fn3


Fibronectin type III domain 


4 . le-102 


352.6 i 


400 


WD4 0 


WD domain, G-beta repeat 


0.O0049 


26 .* 


401 


E1_dehydrog


Dehydrogenase El component 


3e-119 


40S . £ 


402 


fn3


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2.1e-10 


48 . 0 


405 


cadherin


Cadherin domain 


8.ie-81 


281.? | 


406 


zf-CXXC


CXXC zinc finger 


Se-15 


63.4 [ 


410 


RhoGEF 


RhoGEF domain 


l.le-23 


92 . :• i 


411 


F-box 


F-box domain. 


4.2e-06 


33.7 


412 


SNF2_N 


SNF2 and others N-terminal
domain 


S.8e-16 


^61. t 


415 


CPSase_L_cha 
in 


Carbamoyl -phosphate synthase 
(CPSase) 


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3 .8e-24 


93. € 


419 


DENN 


DENN (AEX-3) domain 


2e-58 


207 . h 


420 


RasGEF 


RasGEF domain 


fi.le-43 


1S5.7 


421 


ank 


Ank repeat 


1.4e-lS3 


523.7 


424 


G- patch 


G- patch domain 


le-19 


78.? i 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117. j , 
24 .fc ~\ 


426 


Plexin_repea 
t 


Plexin repeat


0.0023 


24.6 \ 


4 27 


Plexin_repea
t


Plexin repeat


0.0023


-42T 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


6 .Ge~ll 


3 9.2 


431 


DEAD 


DEAD/DEAH box helicase


le-6fc 


214. C i 


43< 


SH3


SH3 domain 


3 . 4e-16 


6 7.2 | 


433 


GTP_CDC 


Cell division protein


2 .le-114 


3S3.5 


436 


Collagen


Collagen triple helix repeat
(20 copies)


4 .6e-19< 


658.1 j 




Ricin_B_lect
in


Similarity to lectin domain of
ricin b


0 .0085 


10. 5 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


1 ,2e-256 


866 .0 


442 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


1.8e-235 


755.7 


443 


PDZ 


PDZ domain (Also known as DHR
or GLGF) . 


1 . 9e-65 


230. S 




LON ~ 


ATP-dependent protease La (LON)
domain 


0.O0012 


-17.1 


446 


ig


Immunoglobulin domain


0.00011 


2u .:. 


,451 


sushi 


Sushi domain (SCR repeat)


1.4e-18 


75.2 


4S2 


fn3


Fibronectin type III domain


1 .5e-06 


35.2 


454 


pyridoxal_de
C


Pyridoxal-dependent
decarboxylase const 


8.3e-14 


50.3 


456 


kinesin


Kinesin motor domain 


4.9e-217


734.4 


457 


neur_chan


Neurotransmitter-gated ion-
channel 


le-175 


597.1 


458 


Josephin


Josephin


0.0002 


18.7 


468 


bZIP


bZIP transcription factor 


1.7e-07


31 .e 


470 


NTP_transfer
ase 


Nucleotidyl transferase 


6 .3e-06 


-V6.3 


471 


WD4 C- 


WD domain, G-beta repeat 


2e-28 


107.9 


473 




L1M domain containing proteins 


0.00021 


20.7 


4 7'/ 


zf-RanBP


Zn-finger in Ran binding
protein and others. 


0.028 


21 . 0 


473 


WD4C 


WD domain, G-beta repeat 


6.5e-18 


73 .C 


480 


KRAB


KRAB box 


le-31 


lie . e 


! 481 

r 


ArfGap


Putative GTP-ase activating
protein for Arf


8.4e-66


232.0 


j 485 


SH2


Src homology domain 2 


0.011 


11.4 


486 


C1q


C1q domain


4 .3e-74 


259. 6 


487 


dsrm


Double-stranded RNA binding
motif


l.le-47 


171 .9 


489 


zf -C2H2 


Zinc finger, C2H2 type


4.8e-153


521 . 9 


490 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


3.4e-222 


751 . 6 


492 


SKI 


Shikimate kinase 


1 .2e-10 


48 . 8 


497 


ENV_polyprot
ein


ENV polyprotein (coat
polyprotein)


2.6e-22 


77 . 6 


498 


abhydrol aae w 


Phospholipase/Carboxylesterase


0.041 


- 4 8 . 1 


500 


rrm


RNA recognition motif.


5.4e-34 


126 .4 


501 


WW 


WW domain 


4.6e-18


73 . 4 


502 


ig


Immunoglobulin domain 


l.le-10 


35.5 


504 


abhydrolase


alpha/beta hydrolase fold


0.045 


-3 . 6 


505 


vwa 


von Willebrand factor type A 
domain 


7.1e-62 


219.0 


508 


Na_K_ATPase_ 
C 


Na+/K+ ATPase C-terminus


2.3e-145 


496.3 


509 


Exonuclease 


Exonuclease 


1.3e-56 


201 .5 


510 


Glycos_trans
f_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


511 


Glycos_trans
f_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_trans
f_1


Glycosyl transferases group 1


1.9e-09


38.5 


514 


pro_isomeras
e


Cyclophilin type peptidyl- 
prolyl cis-tr 


1.8e-63 


222 .4 
~S3_T !


EGF


EGF-like domain


1 . 9e.-16 


74 -7 


51t 

52? 


Surp


Surp module 


4 .3e- 3 8 


140 . 0 




Immunoglobulin domain


3 .3e-06 


] 25 .0 


52 1 




UBX domain


1 -le-34 


126 .6 


526


adh_zinc


Zinc-binding dehydrogenases


2.7e-3 4 


127.4 


53C - 


SAM


SAM domain (Sterile alpha
motif)


0.046 


10.0 


531 


adh_short


short chain dehydrogenase 


0.0025


-34.1 


532 


mito_carr


Mitochondrial carrier proteins 


2 . Se-81 


281 . 7 


S3 3 


mito_carr


Mitochondrial carrier proteins 


2e-6J 


213 . 5 




thiolase


Thiolase 


3 .be-183 


622.0 


J J — 


CMn. ") ■> It m. 
r rlKj a J AC: 


Flavin-binding monooxygenase-
like


0 


1153 .7 


536 


SCAN 


SCAN domain 


4e- 51 


196 . 6 


537


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


538


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


539


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


1.9e-117


403 . 6 


540 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0 


543 


vATP-synt_E


ATP synthase (E/31 kDa) subunit


5.9e-85


295.7


543 


zf-C2H2


Zinc finger, C2H2 type


5.5e-69


242.6 


544 


DUF101 


dufio: 


8.5e-38


139.0


545 


TGFb_propept 
ide 


TGF-beta propeptide


i . ± tr- o / 


£ J o . 4 


547 


WD40




2.6e-32


120.8 


54H 


RHD


Rel homology domain (RHD).


1.6e-238


&B6 .2 


549


MMR_HSR1


GTPase of unknown function




236.0


553


HECT


HECT-domain (ubiquitin-
transferase)


a ^ in 
't . JC"i^ / 


435.6


554


MHC_II_alpha 


antigen, alp 


3 . be- 74 


259.6 


555 


zf-UBR1


Putative zinc finger in N-
recognin


3 . 3e- 1 i 


67.3


556 


Kelch 


Kelch motif 


5 . 5 e - 2 S 


109.7


561


AMP-binding


AMP-binding enzyme


2.6e-06


-163.7


562 


PABP 


Poly-adenylate binding protein,
unique domai


4.9e-38


139.8


564 


Gag_p30


Gag P30 core shell protein


1.2e-67


236.2


560 


PWWP 


PWWP domain 


8. le-lfc 


66.0


567 


SCAN 


SCAN domain 


7.3e-68


238.5


569


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


570 


pkinase


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


571 


CN_hydrolase


Carbon-nitrogen hydrolase


0.00081


-79.7 


572 


myosin_head


Myosin head (motor domain)


0 


1495.2 


573 


myosin_head


Myosin head (motor domain) 


0 


1490.4 


575 


Surp 


Surp module 


1.7e-23 


91.5 


576 


Surp 


Surp module 


1.7e-23


91.5 


577 


DNA_pol_B


DNA polymerase family B


0 


1138.6 


578 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


6.3e-05


42.7


579 


LRR 


Leucine Rich Repeat 


4.5e-21


83.3 


580 




Neurotransmitter-gated ion-
channel


5.9e-177


601.3 


583 


sushi 


Sushi domain (SCR repeat)


0 


1673.0 


584 


DEAD 


DEAD/DEAH box helicase 


7.3e-36 


116.3 


586 


KH-domain 


KH domain 


2.9e-13


57.5 


587 


G-patch


G-patch domain


2.3e-14


61.2


58S 


LIM


LIM domain containing proteins


2.3e-36


133.4


590


bromodomain 


Bromodomain 


6.6e-32 


114.7 


591 


bromodomain 


Bromodomain 


6.6e-32 


114.7


"59":-" 

594 


hormone_rec


Ligand-binding domain of
nuclear hormone


3.5c. 21 


97 . j 


PHD 


PHD-finger


3 . ; - 1 : 


53.8


cadherin


Cadherin domain


4.2e-99


342.7


59t 


pkinase


Eukaryotic protein kinase
domain


5e-V, 


319.2


597 
600" 


WD40


WD domain, G-beta repeat


0.00054


26.7


FG-GAP 


FG-GAP repeat 


4.3e-75


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1 .le-5i 


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86


300.4 


SOi 


Collagen


Collagen triple helix repeat
(20 copies)


8e-4: 


152.4


soc 


mito_carr


Mitochondrial carrier proteins


6.3e-6; 


232.3


608 


PWWP


PWWP domain


2.6e-28 


107.5 


609


PWWP


PWWP domain


2.6e-28


107.5 


613 


CAP_GLY 


CAP-Gly domain 


0.0046


20.1 


615 


RFX_DNA_bind 
ing 


RFX DNA-binding domain


5.2e-54


192.9 


616 


kinesin


Kinesin motor domain


1.1e-83


284.8


617


kinesin


Kinesin motor domain


8 . 4e- eu 


278.5


6ie 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0 . 009fc 


13.1


620


MATH


MATH domain


7 . Be-CS 


22.2


623 


Y_phosphatase


Protein-tyrosine phosphatase


1.4e-32


121.6


622 


pkinase


Eukaryotic protein kinase
domain


4 . 4 r - 4 C 


146.6


623


BNR 


BNR repeat 


2.1e-13


51.3 


624 


molybdopterin


Prokaryotic molybdopterin 
oxidoreductas 


1 . 4r-l^ 


42.2 


62b 


TPR 


TPR Domain 


1.1e-17


72.2 


627 


cNMP_binding 


Cyclic nucleotide-binding
domain 


3 . 7e-Sfc 


206.6 


630


adh_short


short chain dehydrogenase 


5e - 3 7 


70.0 


633 


zf-C2H2


Zinc finger, C2H2 type


2.1e-88


307.1 


-6T2 


rrm 


RNA recognition motif. 


4e-0L 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104


360.7 


636 


Fork_head 


Fork head domain 


5.9e-27 


103.0 


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70


246.5 


642 


TPR 


TPR Domain 


4.8e-08


40.1


643 


efhand


EF hand 


1.9e-27 


104.6


647 


SNF2_N 


SNF2 and others N-terminal
domain


1.2e-101


351.1 


648


PseudoU_synth_2


RNA pseudouridylate synthase


1.9e-55


197.6 


650 


zf-C2H2


Zinc finger, C2H2 type


0.0087


22.7 


653 


ank 


Ank repeat


1.3e-17


1l. $ 


652 


I_LWEQ 


I/LWEQ domain 


9.5e-101


341.0 


653 


neur_chan 


Neurotransmitter-gated ion- 
channel 


4.1e-171


581.8 


654 


tsp_1


Thrombospondin type 1 domain


4.1e-47


169.9 


659 


FH2 


Formin Homology 2 Domain 


1e-107


371.2 


661 


pou 


Pou domain - N-terminal to
homeobox domain


5.3e-45


162.9 


662 


C2 


C2 domain 


6.7e-19


76.2 


663 


C2 


C2 domain 


6.7e-19 


76.2 


664 


C2 


C2 domain 


6.7e-19


76.2 


667 


GST 


Glutathione S-transferases.


9.3e-34 


114.4 


666 


LRR 


Leucine Rich Repeat 


9.3e-31


115.6 


670 


spectrin 


Spectrin repeat 


4e-57


203.2 


673 




I/LWEQ domain 


9.5e-101 


341.0


672


ABC_tran


ABC transporter 


5.3e-60 


212.8 


674 


WD40


WD domain, G-beta repeat 


4.8e-24 


93.3


67b 


WD40


WD domain, G-beta repeat


4.8e-24


93.3


6"yfc 


LRR


Leucine Rich Repeat


0 . O0IS 


25.2


~T?9 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-29


107.7 


680 


zf-C2H2


Zinc finger, C2H2 type


5.2e-05


30.1


se: 


CH


Calponin homology (CH) domain


2.4e-17


71.1


682 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4.3e-43


156.6 


683 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


0 . 0b j 


10.8


687 


Synapsin 


Synapsin


0


1890.8




PR55


Protein phosphatase 2A
regulatory subunit PR


0


1038.8


6 SI 


homeobox


Homeobox domain


8 . be- 30 


112.4 


656 


Peptidase_M24


metallopeptidase family M24


2 . Be- 59 


210.5 


697 


RhoGEF 


RhoGEF domain


9 . be- 35 


128.9


696 


PHD 


PHD-finger


0.006 


9.3 


701 


zf-C2H2


Zinc finger, C2H2 type


5.5e-123


422.0


702 


Sulfatase


Sulfatase


3e-231


781 . C 


703 


zf-C2H2


Zinc finger, C2H2 type


5.7e-20


79.8 


707 


Acyl_transf 


Acyl transferase domain 


1.1e-22


88 . e 


706 


WD40


WD domain, G-beta repeat 


4 .8e-lS 


76.7


710 


Ran_BP1


RanBP1 domain.


8.4e-06


-7.3 


71? 


DEAD 


DEAD/DEAH box helicase


9.9e-42


134.9


714 


PH


PH domain


1.6e-09


39.0


715


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.5e-37


138.2


717 


Sialyltransf 


Sialyltransferase family


7.5e-31


115.9 


71 £ 




Immunoglobulin domain


le-2& 


100.8 


719 


integrin_B


Integrins, beta chain


0 


1125.4 


720 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1.1e-08


32.4


722


Peptidase_C2


Calpain family cysteine 
protease 


3e-145


495.9


72 > 


ig


Immunoglobulin domain


2.2e-05


22.4


72 6 


F-box 


F box domain. 


0.007


23- C 


72 5 


Nop 


Putative snoRNA binding domain


8.1e-58


205.5


72* 


Nop 


Putative snoRNA binding domain


8.1e-58


205.5 


72 7 


WD40


WD domain, G-beta repeat


7.5e-26 


99.3 


730


dsrm


Double-stranded RNA binding
motif


0.027


12.1 


73 J 


dynamin


Dynamin family


4.2e-16


66.9


732 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.8e-10


41.7 


73 *> 


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26


100.1 


738


DEAD


DEAD/DEAH box helicase


8.6e-57


182.5 


73? 


TSC22


TSC-22/dip/bun family


6.5e-32


119.5 


742


ras


Ras family


2.2e-100


346.9 


743 


PMI_typeI


Phosphomannose isomerase type I


1.2e-243


822.9


747 


trypsin 


Trypsin 


6.4e-88


279.4 


74fr 


kazal


Kazal-type serine protease 
inhibitor domain 


2.2e-52


187.4 


749


efhand


EF hand


6.3e-06


33.1


75} 


PHD 


PHD-finger


4.5e-16


66.7


757 


zf-C2H2


Zinc finger, C2H2 type


3.2e-21


83.9


7s; 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-ll 


49.8 


754 


Ribosomal_L3 
9 


Ribosomal L39 protein 


0.00018 


26.7 


755 


PH 


PH domain 


3.6e-14 


55.7 


Ibl 


SCAN 


SCAN domain 


1.4e-53 


191.5 


75£ 


PA 


PA domain 


0.0065 


23.1 


760 


arf 


ADP-ribosylation factor family 


2.2e-Z 9 


77.6 


763 


CIDE-N 


CIDE-N domain 


2.2e-40 


147.6


76: 


histone


Core histone H2A/H2B/H3/H4


9 . 9e-s:- 


188.6


'761' 


zf-MYND 


MYND finger


4.1e-14


60.3 


70 4 


pou


Pou domain - N-terminal to
homeobox domain


1e-52


188.6


767


vwc


von Willebrand factor type C
domain 


2 . Se- 34 


127.3 


765 


efhand


EF hand


4.8e-11


50.1 


770


zf-C4


Zinc finger, C4 type (two
domains)


2.4e-53


181.6 


772 


ras 


Ras family 


7e-90


312.0 


773 


Sulfatase


Sulfatase


1e-142


487.5


775 


zf-C2H2 


Zinc finger, C2H2 type


1.1e-12


55.5 


7 7e 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5


777


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5


77* 


rrm 


RNA recognition motif. 


1.1e-32


121.1


i 7? o 


G6PD 


Glucose-6-phosphate
dehydrogenase 


l . be-76 


236.6


780


spectrin 


Spectrin repeat 


3 . 7e-2S 


110.3 


781 


mito_carr


Mitochondrial carrier proteins


4.6e-57


198.5


782 


SCAN 


SCAN domain 


1.3e-24


95.2 


7 81* 


PDZ


PDZ domain (Also known as DHR
or GLGF).


4.1e-07


37.1 


76b 


DEAD 


DEAD/DEAH box helicase


f.e-06 


21.7




ras 


Ras family 


5.3e-39


143.0 


787 


RNase_HII


Ribonuclease HII


! 2 . be-67 


237.1


790


PI3_PI4_kinase


Phosphatidylinositol 3- and 4-
kinases 


1 5 4e-106 


372.2 


75S 


cadherin 


Cadherin domain 


2 . be-40 


147.4 


79fc 


ARID 


ARID DNA binding domain 


3.6e-20


81.6 


797 


trypsin 


Trypsin


9.9e-20


64.8


79f 


CH 


Calponin homology (CH) domain


3.7e-15


63.8


803 


Gal-bind_lectin


Vertebrate galactoside-binding
lectin 


4 . le-2b 


88.7


so:. 


WD40


WD domain, G-beta repeat


0.00082


26.1 


80C 


TBC 


TBC domain 


1.8e-26


101.4


807 


TBC 


TBC domain 


1.8e-26


101.4


806 


CN_hydrolase


Carbon-nitrogen hydrolase


8.8e-80


278.5 


823 


CBFD_NFYB_HMF


Histone-like transcription
factor


6e-14


59.8 


ei: 


adh_short 


short chain dehydrogenase 


8.1e-20


79.3 


814 


IMP4 


Domain of unknown function 


3.3e-71


250.0 


81b 


zf-C2H2


Zinc finger, C2H2 type 


8.2e-66 


232.1 


Bit 


Pept_tRNA_hydro


Peptidyl-tRNA hydrolase


1.6e-37


338.0 


817 


ARID 


ARID DNA binding domain 


2 . be-18 


74.3 


826 


IF5_eIF4_eIF2


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32


321.5


83 f 


ArfGap


Putative GTP-ase activating
protein for Arf


1.5e-53


191.3


833


LRR


Leucine Rich Repeat


2.1e-26


101.1


B32 


laminin_EGF


Laminin EGF-like (Domains III
and V)


2e-57


204.2


835


rrm


RNA recognition motif.


1.3e-22


88.5


840 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


2.6e-119 


4C9?8 


843


pkinase


Eukaryotic protein kinase
domain


3.4e-100


346.3


844 


Ribosomal_L22e


Ribosomal L22e protein family


1e-64


228.4


84C 


IBR 


IBR domain 


9e-15 


62.5 


04 5 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


85C 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


0.00016


18.9 


853 


SET 


SET domain 


5e-30 


113.2


852 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


853


SRCR


Scavenger receptor cysteine
rich domain


0


102b.', ; 


857 


lactamase_B


Metallo-beta-lactamase
superfamily


0.012


-6.0


858


COX6A


Cytochrome c oxidase subunit
VIa


3.4e-58


206.7


BSS 


rrm 


RNA recognition motif.


5.4e-45


162.9 


861 


PRK


Phosphoribulokinase


S . le- 62 


219.4 


863 


mito_carr 


Mitochondrial carrier proteins 


2.5e-53 


185.5


864


HSP90


Hsp90 protein


4.7e-158


538.5


866 


ig


Immunoglobulin domain


4e-12


44 . j 


867 


zf-C2H2 


Zinc finger, C2H2 type


7e-135


461.5


872 


histone 


Core histone H2A/H2B/H3/H4 


4.9e-41


14 9 . £ 


874 


CPSase_L_chain


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218


73 9. C | 


879 


Ribosomal_S12e


Ribosomal protein S12e


2.1e-98 


340.3 


882


serpin 


Serpins (serine protease 
inhibitors) 


2 . be - 4 2 


145.7


883 


Patatin 


Patatin 


1.2e-51


182 . C 


884 


RA 


Ras association (RalGDS/AF-6)
domain


0.044


8.0


887


DUF92 


integral membrane protein DUF92 


2.7e-12


b4 . 3 


889 


sugar_tr


Sugar (and other) transporter


8.2e-63


222 . : 


893 


DUF28 


Domain of unknown function 
DUF28


1.3e-43


158.3


896 


IP_trans 


Phosphatidylinositol transfer
protein


6.5e-98


338.7


898 


DEAD 


DEAD/DEAH box helicase


1.5e-48


156.1


899 


KE2 


KE2 family protein 


7e-61


215.7 


900 


KE2 


KE2 family protein 


4.3e-51


183.2


901 


zf-C2H2


Zinc finger, C2H2 type


2.7e-57


203. e 


902 


ras 


Ras family 


2. 3e-7S 


263. b j 
-_ ? -t 


904 


TPR 


TPR Domain 


3.2e-22


87.2 


906 


GBP


Guanylate-binding protein


8.9e-253


853.1


907 


GBP 


Guanylate-binding protein 


1.1e-239


809. t : 


908 


WD40


WD domain, G-beta repeat


2.6e-26


100. f 


909 


PH


PH domain


1.3e-09


39 . A 


910 


zf-C2H2


Zinc finger, C2H2 type 


2 . Sc-39 


144.1 


913 


Epimerase


NAD dependent
epimerase/dehydratase family


5e-07


-88.5


921 


TBC 


TBC domain 


I . be- 09 


30.7


922


WD40


WD domain, G-beta repeat 


1.6e-25 


98.5


923


WD40


WD domain, G-beta repeat


8.2e-07


36.3 


924 


Hydrolase 


haloacid dehalogenase-like
hydrolase


2.9e-05


29.1 


925 


UQ_con


Ubiquitin-conjugating enzyme


0 . 00033 


-27.* 1 


926 


CH 


Calponin homology (CH) domain


3.3e-53


190.2


928


WD40


WD domain, G-beta repeat


5.9e-46


172.7 


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-10


37.4


930 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase
family 


7.2e-105 


361 .e 


931


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase
family


1.2e-96


334.4


936


C2


C2 domain


2.2e-62


220.7 


937 


NAP_family


Nucleosome assembly protein
(NAP)


1.1e-22


84.6


940 


abhydrolase


alpha/beta hydrolase fold 


0.011 


3.1 


944 


Tropomyosin 


Tropomyosins 


3.2e-07 


25.1 


948 


pkinase


Eukaryotic protein kinase 
domain 


3.4e-75 


263.2 


949


WD40


WD domain, G-beta repeat


1.8e-27


104.7


950 


Acyltransferase


Acyltransferase


1.6e-07 


38.4


SSI 


SAM 


SAM domain (Sterile alpha
motif)


0.014


14 " A 


9SA 


GFO_IDH_MocA


Oxidoreductase family


1.3e-11


52 . (■ 


95E 


BTB 


BTB/POZ domain 


7e-22 


86.3 


956 


BTB 


BTB/POZ domain


7e-22


86.3


957 


CDP- 

OH_P_transf 


CDP-alcohol
phosphatidyltransferase


0.053


-22.: 


959 


ras 


Ras family


2.4e-97


336.1


960 


ras 


Ras family 


8.4e-43


155. t 


961 


Acetyltransf 


Acetyltransferase (GNAT) family


1.2e-08


42 . : 


962 


adh_short


short chain dehydrogenase


2.4e-31


117 


963 


mutT 


Bacterial mutT protein 


5.6e-06 


26 


965 


IF-2B


Initiation factor 2 subunit
family


8.4e-193


653 .<- 


97G 


RNase_PH 


3' exoribonuclease family


9e-24 


92 . 4 




WW 


WW domain 


5 .7e-2S 


9G .4 


977 


PDZ 


PDZ domain (Also known as DHR
or GLGF).


3.6e-21


83.7


976 


Ribosomal_L17


Ribosomal protein L17


2.4e-20


81 . ( 


979 


LIM


LIM domain containing proteins


5.8e-42


152.1


980 


Calsequestri 
n 


Calsequestrin


1.7e-297


1001.7


982 


HSP20 


Hsp20/alpha crystallin family 


1.2e-30


43.; 


983 


oxidored_q6 


NADH ubiquinone oxidoreductase,
20 Kd sub


4.8e-63


222": &" "" 


988 


TBC 


TBC domain 


2.2e-50


180.1 


989 


TBC 


TBC domain 


2.2e-50


160. f- 


993 


tRNA_int_endo


tRNA intron endonuclease


0.0017


-34 .; 


994 


homeobox


Homeobox domain 


4e-18 


73 . t 


997 


pyr_redox


Pyridine nucleotide-disulphide
oxidoreducta


0.012


11 . t 


1000 


mito_carr


Mitochondrial carrier proteins


9.7e-123


4 21 .> 


1001 


RA 


Ras association (RalGDS/AF-6)
domain


1.2e-15


65.4 


1004 


DUF81 


Domain of unknown function 
DUF81 


0.099 


10.2 


1005


actin


Actin


1.3e-174


574.2


1006 


actin 


Actin 


3.1e-130


428.1


1007


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3 .7e-19S 


661 


1008 


TPR 


TPR Domain 


8.1e-44


159.0 


1009 


zf-C2H2


Zinc finger, C2H2 type


3.6e-61


216 A 


1011 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61


216 A 


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


4.7e-lS 


53.3


1016 


tRNA-synt_2c 


tRNA synthetases class II (A)


2.3e-15


55.2


1018


RhoGAP 


RhoGAP domain 


1.6e-78


274.3


1022 


PGAM 


Phosphoglycerate mutase family 


3.8e-18


69.7 


1026 


HMGbox 


HMG (high mobility group) box 


8.4e-20


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162. t 


1028 


UQcon 


Ubiquitin-conjugating enzyme


1.4e-49


178.3 


1032 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


0.028 


16.3


1034 


Hydrolase 


haloacid dehalogenase-like
hydrolase 


2e-21 


84.6 


1037 


KRAB 


KRAB box 


4.8e-06


32.4 


1038 


Cation_efflux


Cation efflux family 


7.1e-42 


152.5 


1040 


ART 


NAD:arginine ADP-
ribosyltransferase


4.7e-47 


169.: 


1042 


WD40


WD domain, G-beta repeat


1.9e-18


74.7 


1043 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-24 


93.7 


1045 


lectin_c 


Lectin C-type domain


l.Se-28 


108.0 


1046 


Glucosamine_iso


Glucosamine-6-phosphate
isomerase


0.00013 


-25.3


1047 


ligase-CoA


CoA-ligases


4 .Se-8C 


"27 9". 


1049 


ig


Immunoglobulin domain


1 .7e-0S 




10SG 


Ribosomal_L,2 
4p 


Ribosomal protein L24e


2e-33 


124 .1 


1054 


Amidase 


Amidase 


4 .3e-lS2 


516 . 


1055


rrm


RNA recognition motif.


3.8e-26


100 . 


1058 


annexin 


Annexin


6.9e-44


159 .: 


105S 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


0.023 


-23 


1060


homeobox


Homeobox domain


3.2e-3: 


1 1 7 . > 


1062 


Acyltransfer 
ase 


Acyltransferase


0.00065


10.1


1064 


AMP-binding


AMP-binding enzyme


6.6e-100


345.? 


1065


LRR


Leucine Rich Repeat


3.3e-14


60. e 


1066 


GTP1_OBG


GTP1/OBG family


4.8e-41


141 .t 


1071 


ig


Immunoglobulin domain


8.4e-48 


159. "J 


1072 


PHD


PHD-finger


6.8e-07


36.2


1074 


DENN 


DENN domain


8.3C-31- 


121 . b 


1075 


SCP 


SCP-like extracellular protein


4 . ve-4: 


149. b 


1077 


OLF 


Olfactomedin-like domain


2 .2c~6€ 


234 .C 


1078 


mito_carr 


Mitochondrial carrier proteins 


le-42 


149.3 


1079 


WD40


WD domain, G-beta repeat 


6.2e-4S 


162.'' 


1007 


START 


START domain 


1 .5e-4ti 


174.7


1093


DSPc


Dual specificity phosphatase, 
catalytic dom*. 


3.3e-63


223 .4 


109-5 


GSHPx 


Glutathione peroxidase


9.6e-41


148 .e 


1095 


DUF25


Domain of unknown function 
DUF25 


2e-75 


264 .0 


1096 


DUF25


Domain of unknown function 
DUF25 


6e-75 


262 .4 


1105 


Nitroreductase


Nitroreductase family


1.3e-13


58. C 


1106 


FIX 


Phosphodiesterase family 


3.3e-175


610.1


1107 


DAGKc 


Diacylglycerol kinase catalytic
domain 


0.0004S 


19. f 


1109 


ras 


Ras family 


1 .3e -IS 


40.7 


1115 


ArfGap


Putative GTP-ase activating
protein for Arf


9.7e-47 


168.7 


1116 


HMG14_17 


HMG14 and HMG17


4.4e-23


83. h 


1117 


HMG14_17 


HMG14 and HMG17


9.9e-12 


52.4 


1119 


FAA_hydrolas 
e 


Fumarylacetoacetate (FAA)
hydrolase fam


2e-83


290.6 


1120 


pkinase


Eukaryotic protein kinase
domain


1.4e-94


327.6 


1123 


abhydrolase 


alpha/beta hydrolase fold 


9.2e-23 


6 9 . (' 


1129 


pro_isomerase


Cyclophilin type peptidyl- 
prolyl cis-tr 


2.2e-56 


197.3 


1131 


DnaJ 


DnaJ domain 


1.6e-30 


114 .9 


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19 


78.6


1133


WD40


WD domain, G-beta repeat


1 .Be-15 


64.9


1134


PH


PH domain 


0.0015 


17.8 


1136 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


1.2e-256 


866.0 


1137 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


2.5e-209 


708. S 


1139


ras


Ras family


1.5e-86


301.0


1141 


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 


Acyltransferase


Acyltransferase


1.2e-0S 


29.9 


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55 


196.1 


115S 


ig


Immunoglobulin domain 


1.3e-31 


106.9 


1157 


Asparaginase_2


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred 


GMC oxidoreductases


4.7e-142 


435.3 




zf-AN1


AN1-like Zinc finger


0.00021 


27.9


me:* 


linker_histone


linker histone H1 and H5 family


3.8e-14


60.4


1164 

"liTi 


DED


Death effector domain


3.9e-6r 


IRS 


PTB domain (IRS-1 type)


2.6e-43


157.3


a 16( 


IRS 


PTB domain (IRS-1 type)


2.6e-43


157.3


~Il6t 


SAM 


SAM domain (Sterile alpha 
motif)


0. 04 j 10.5 

! 


1170


abhydrolase


alpha/beta hydrolase fold


0.096


-7.5


1174 


SAP 


SAP domain 


3.9e-lC 


47.1


1177 


PP2C 


Protein phosphatase 2C 


5. 3e-3i 


112.5


ll7f 


WD40


WD domain, G-beta repeat 


4. 7e-3S 


129.9


use 


Ets 


Ets-domain


1.8e-09


33.3


1181 


Collagen 


Collagen triple helix repeat 
(20 copies)


0.00016


24.7


1162 


TCL1_MTCP1 


TCL1/MTCP1 family


9.5e-56


198.6


1164 


RasGEF


RasGEF domain


1.7e-88


307.4


128b 


mito_carr


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6


u-PAR/Ly-6 domain


0.0042


15.6


1186 


Orn_DAP_Arg_ 
dec 


Pyridoxal-dependent
decarboxylase 


6.2e-12c 


430.6 


119? 


Stathmin 


Stathmin family 


1.8e-90


314.0


1194 


Stathmin 


Stathmin family 


1.0e-90 


314.0


1155


Sec1


Sec1 family


3.2e-183


622.1 


use 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta


3.1e-32


111.6 


1197 


Glyco_transf_8


Glycosyl transferase family 8


1.2e-0S 


45.5 


1202 


K_tetra


K+ channel tetramerisation
domain


0.022


-16.8 




adh_short 


short chain dehydrogenase


8.3e-4S 


162.3 


1206 


Ubie_methyltran


ubiE/COQ5 methyltransferase
family


1.3e-121


417.4 


1206 


7tm_3 


7 transmembrane receptor 


7.2e-09


29.0


120S 


ank 


Ank repeat


3.9e-lS 


63.7


121C 


vATP-synt_AC39


ATP synthase (C/AC39) subunit


2.be-12S 


439.7


1212 


zf-C2H2 


Zinc finger, C2H2 type 


S.5e-17 


69.9


1213 


efhand 


EF hand 


3.2e-07


37.4


1219


rrm


RNA recognition motif.


2.1e-40 


147.7


1220


DUF6 


Integral membrane protein DUF6 


o.ois 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71 


251.1 


1223


G-gamma


GGL domain


3.6e-36


129.5


1227 


catalase 


Catalase 


0 


1158.9 


1232 


PX


PX domain


2.2e-15


64.5


1233 


PX 


PX domain 


2.2e-15 


64.5


1236 


FCH 


Fes/CIP4 homology domain 


3.3e-05 


44.0


1241 


Peptidase_M2 
0 


Peptidase family M20/M25/M40 


2e-63


224.1


1243 


WW 


WW domain 


0.044


17.9 


1247 


UPF0006 


Metalloenzyme of unknown
function UPF0006


6.3e-61


215.8 


1248 


Glycos_transf_2


Glycosyl transferases


4.5e-10


46.9 


124S 


efhand 


EF hand 


4e-11


50.4


1254 


UQcon 


Ubiquitin-conjugating enzyme


2.1e-73 


257.3 


1255 


ras 


Ras family 


2.2e-62 


220.7 


1256 


formyl_transf


Formyl transferase 


4.5e-30 


108.3 


1255 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13 


46.4 


1261 


DiHfolate_re 
c 


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transpept


Gamma-glutamyltranspeptidase


1.8e-110 


380.4 


1262 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22


86.9


126< 


SCP


SCP-like extracellular protein


6e-25


108.0


12 6'/ 


K_tetra


K+ channel tetramerisation
domain


2.8e-27


104.0


J. £ D _ 




p a c: ♦ : mi 1 V 


1 3e - 8 b 


297 <5 


127S 


*7 f - CI i-Ti 

z i - (-Jr.LS 


i, 1 1 1 1 - i ^ i j m «~ x , v — ? h i_ vp c \ r% _i i\ o 


4 . 2e - 1 C 


37 C 


1 2 7£ 


abhydrolase


alpha/beta hydrolase fold


5.4e-23


89.6


127'/ 


abhydrolase 


alpha/beta hydrolase fold


5.6e-21 


83.1 




trypsin


Trypsin


4.4e-41


132.0


i o n r. 

1 X O L 


PBP 


Phosphatidylethanolamine-
binding protein


1 3e- 3 3 


58 . 7 


I28t 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


5.6e-14


49.6


128: 


ank 


Ank repeat. 


1.7e-52


187.8 


1294 


fn3 


Fibronectin type III domain


0.026


20.9 


129b 


GDP 


»f~v\f T a t" f» «• V\> r» c\ \ wet rwrvJ 1 Pin 

UUDUyiOVC L>Al.VJiliy ^11 UlC 111 


V * V) V> w v> 


^70.0 


1296 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family


6.9e-41


149.3 


1297


Rhodanese


Rhodanese-like domain


3.2e-14


£n i 

t)U . / 


129* 


LIM 


LIM domain containing proteins


5.8e-21


79.1 


1301


rnaseA


Pancreatic ribonucleases


4.9e-43


145.2


1307


mito_carr


Mitochondrial carrier proteins 


2.le-5> 


186.0


130b 


WD40 


WD domain, G-beta repeat 


1 . 6e- 1" 


71.6


131C 


UPAR_LY6 


u-PAR/Ly-6 domain 


7 . le- 2C 


75.5


I31i 


thiored


Thioredoxin


3.6e-05


21.6


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


l.Se-6"; 


237.9 


1316 


trypsin 


Trypsin


4.4e-41


132.0


1320 


Ribosomal_L13


Ribosomal protein L13


3 . 9e- 6i 


219.8 


1327 


Armadillo_seg


Armadillo/beta-catenin-like
repeats


0.0054


23.4


13 2 h 


KRAB 


KRAB box


0.052


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1330 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.014 


-1.6 


1333 


PX 


PX domain 


2.1e-lC 


48.0


1333 


KRAB 


KRAB box


1.8e-36


134 .6 


1334 


UPP_synthetase


Putative undecaprenyl
diphosphate synt 


2 .3e-8S 


310.3 


1335 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt 


1.8e-59


211.0


1336


DSPc


Dual specificity phosphatase, 
catalytic doma 


1.2e-31


1 1 Q C 
X 1 o . b 


1337 


DSPc 


Dual specificity phosphatase,
catalytic doma


^ . ^e ' X*. 


<^4 C. 
J* . D 


1338 


TPR 


TPR Domain 


0.00021


28.1 


134C 


metalthio 


Metallothionein


0.013 


20.3 


1341 


mutT 


Bacterial mutT protein 


5.8e-09 


36.5 


1343 


Band_41


FERM domain (Band 4.1 family)


1.3e-38


122.5 


1344 


Kelch 


Kelch motif


1.4e-44


161.5 


134b 


Antifreeze 


mil. i l x cexe j/iuvc jiij 


1.2e-10


48.8


1347 


3Beta_HSD 


3-beta hydroxysteroid
dehydrogenase/isomera


0.086


-177.2 


1348 


BTB 


BTB/POZ domain


5.3e-28


106.5


1349 


DUF6 




0 . 03 3 


15.8 


1350 


myosin_head


Myosin head (motor domain)


0


1088.7


1352 


Nramp 


Natural resistance-associated
macrophage pro 


1.2e-202 


686.6 


1353


S_100


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase 


3.6e-65


209.0 


1356


C2 


C2 domain 


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-141 


481.4


1361


HMG14_17


HMG14 and HMG17 


7.9e-4C 


145.7


2 3 6i 


SIS 


SIS domain


3 . Be - 3 C 


113.6 


1 2 6 I* 


SIS 


SIS domain


1.3e-26


108.5




ig 


Immunoglobulin domain


0.00026


19.0 


13 6 8 


K_tetra


K+ channel tetramerisation
domain


1.1e-16


68.9 


13 71 


Collagen


Collagen triple helix repeat
(20 copies)


2.2e-113


390.1


13 72 


DnaJ


DnaJ domain


ft Co. IC 


132.7


13 7 6 


KRAB


KRAB box


2.1e-38


141.0


1 3 V H 


ELM2


ELM2 domain


2e - 2 *• 


91.3


1~5 £ C 


thiored


Thioredoxin


1.2e-23


82.8 


13 83 


ank


Ank repeat


2.3e-83


290.4


lit* 


BTB 


BTB/POZ domain


3 e - 1 j 


SO . 8 


1 3 P3 


WD40


WD domain, G-beta repeat


1.6e-19


78.3


13 64 


WD40


WD domain, G-beta repeat


6.3e-24


92.9


13 6"/ 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1.1e-09


35.6


1 3 6 S« 


zf-C2H2


Zinc finger, C2H2 type 


5 . 5e - 5C 


179.5


13 50 


zf-C2H2


Zinc finger, C2H2 type 


2 . Se-85 


296.9 


1393 


kinesin 


Kinesin motor domain 


7.8e-188


637.4


1354 


zf-C2H2


Zinc finger, C2H2 type


1.2e-49


178.4


13 56 


KRAB 


KRAB box 


5.1e-22 


86.6 


14C; 


bZIP 


bZIP transcription factor 


0 . 03S 


13 .1 


14C5 


sugar_tr


Sugar (and other) transporter 


0. 003 


-101 .5 


14CC 


RhoGAP 


RhoGAP domain 


8 . 9e-47 


168. 6 


140/ 


rrm 


RNA recognition motif. 


In- lib 


132.1 


i4oe 


LRR 


Leucine Rich Repeat 


2.1e-13 


S8.0 


140S 


Nebulin_repeat


Nebulin repeat 


6e-54 


192.6 


141C 


ank 


Ank repeat 


1. 6e-17 


71 .6 


1412 


Ribosomal_L5_C


ribosomal L5P family C-terminus 


8. 2C-58 


205.5 


141E 


trypsin 


Trypsin 


4 . 7e-85 


270.4 


1416 


aminotran_1


Aminotransferases class-I 


4.4e-05 


-91.2 


1417 


S1


S1 RNA binding domain


1 . 6e-C7 


33 .1 


1419 


WD4 0 


WD domain, G-beta repeat 


2 . 2e-0P 


44 .6 


1422 


cadherin 


Cadherin domain 


8.3e-42 


152.3 


1424 


SH3 


SH3 domain 


2. Se-80 


280.3 


142b 


PHD 


PHD-finger


3 ,2e-17 


70.6 


1426 


PHD 


PHD-finger


3 .2e-17 


70.6 


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


le-37 


138.8 


1426 

- 


helicase_C 


Helicases conserved C-terminal
domain 


le- 26 


102.2 


142 9 


WD4 0 


WD domain, G-beta repeat 


3 . 9e-07 


37 .2 


14 3 0 


inositol_P


Inositol monophosphatase family 


2 . Se- 10 


4 0.2 


1431 


mito_carr


Mitochondrial carrier proteins 


4 . 3e-83 


287.7 






C1q domain


2 . Se-16 


66.2 


14 34 


WD4 0 


WD domain, G-beta repeat


1 . 6e- 13 


58.3 


14 3 5 


Inos-1-P_synth


Myo-inositol-1-phosphate
synthase


7 e - 2 2 8 


770 . 4 


1436 




RNA recognition motif.




128.3 


14 3b 


ig


Immunoglobulin domain


1 3e-12 i 


45.6 


144 C 


G_Adapt_CT


Gamma-adaptin, C-terminus


J .SCO/ 


236.7 


1441 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7


144 3 


Kelch 


Kelch motif 


U , UUUi j 


« 9 - r 


1446 


ARID 


ARID DNA binding domain


1 . 8e-21 


84 .7 


1447 


zf-C2H2 1 


Zinc finger, C2H2 type 


9 .4e-28 


105.6 


1446 


AMP-binding


AMP-binding enzyme


2 .6e-07 


-145. 1 


1451 


rrm 


RNA recognition motif. 


6.5e-2l 


82.9 


14S4 


i3 


Immunoglobulin domain 


5 6e-44 


146.7 


1455 


Sialyltransf 


Sialyltransferase family 


5.4e-21 


83.2 


146C 


Aldose_epim


Aldose 1-epimerase


1 .9e-3£ 


131.2 


1461 


C2


C2 domain


4e-16 


73.6 


147C 


TIG


IPT/TIG domain


3 .le-19 


77.3 


1472 


PseudoU_synt


RNA pseudouridylate synthase


4 .3e-16 


66.9






14 7<, 


DENN


DENN (AEX-3) domain


1.3e-44 


161 . t 


1475 


Cation_efflux


Cation efflux family 


4 . 6e-42 


176 .4 


1477 


TBC 


TBC domain


8e-47 


169 . 0 


14 7£ 


rrm 


RNA recognition motif . 


2e-2i 


84 . 6 


140C 


1C 


Immunoglobulin domain 


5.5e-06 


24 .3 


1484 


Telo_bind_alpha


Telomere-binding protein alpha
subuni 


0.028 


-225.9 


148S 


zf-C2H2 


Zinc finger, C2H2 type 


1 . 8e-68 


240.9 


148b 


pkinase


Eukaryotic protein kinase 
domain 


9.5e-13 


49.9 


1486 


helicase_C


Helicases conserved C-terminal
domain 


1.4e-l5 


6S.2 


1482 


DUF89


Protein of unknown function
DUF89 


0.079 


-132 .4 


1490 


ECH 


Enoyl-CoA hydratase/isomerase
family 


5.2e-41 


149.7 


1491 


guanylate_cyc


Adenylate and Guanylate cyclase
catalyt 


5.9e-46 


166.1 


1492 


LRR 


Leucine Rich Repeat 


3 .4e-l9 


77.2 


1495 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


7. le-10 


36.3 


149"? 


pkinase


Eukaryotic protein kinase 
domain 


le-22 


85. fc 


1500 


SH3


SH3 domain 


9.3e-05 


27.2 


1502 


homeobox 


Homeobox domain


0 .084 


13 . G 


1503 


homeobox 


Homeobox domain 


0. 084 


13.8 


1505 


EGF 


EGF-like domain


2 .7e-23 


90 . fc 


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family 


2 .7e-21 


84 . 2 


1508 


Peptidase_M20


Peptidase family M20/M25/M40


2 . 8e-28 


101 .e 


1511 


PX 


PX domain 


1 .9e-ll 


51 .5 


1512 


Sulfatase


Sulfatase


2 . 8e-35 


130.7 


asie 


Syntaxin 


Syntaxin 


0 . 011 


-62 . 3 


1518 


aminotran_3 


Aminotransferases class-III
pyridoxal-pho


9.7e-106 


305.6 


1520 


ig


Immunoglobulin domain


0.075 


11 . 0 


1521 


RA 


Ras association (RalGDS/AF-6)
domain 


0. 01 3 


13 . 3 


1S23 


RhoGAP 


RhoGAP domain 


2 .5e-05 


18.7 


1528 


WD40 


WD domain, G-beta repeat 


S.4e~24 


93.1 


1535 


IMS 


impB/mucB/samB family 


7,8e-95 


328.5 


1538 


FYVE 


FYVE zinc finger 


3 .2e-27 


101 .5 


1S39 


DAGKC 


Diacylglycerol kinase catalytic 
domain 


6e-07 


36 . 5 


1540 


Ocular_alb


Ocular albinism type 1 protein 


0 


1184.7 


1653 


SAP 


SAP domain 


6e-06 


33 . 2 


1654 


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


157 . C 


1655 


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


157 . 0 


1656 


RhoGEF 


RhoGEF domain 


1 . 4e-24 


95 . 1 


1657 


MMR_HSR1


GTPase of unknown function 


0.0011 


-45.5 


1659 


m2 


Ubiquitin carboxyl-terminal
hydrolase family 


2.5e-ll 


51 . 1 


1660 


actin 


Actin


6 .6e-21 


69.9 


1661 


BAH 


BAH domain 


1.7e-82 


287.5 


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4 


1663 


WD40


WD domain, G-beta repeat


1.4e-67 


237 .9 


1667 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-93 


324 .4 


1669 


Nol1_Nop2_Sun


NOL1/NOP2/sun family


1.3e-23 


84.3 


1671 


SH2 


Src homology domain 2 


5.4e-l5 


46.9


167; 


chromo


'chromo' (CHRromatin
Organization MOdifier)


2 .le-18 


67 . 7 


167- 

1 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


0.002S 


17 . £ 


167( 


Glyco_hydro_47


Glycosyl hydrolase family 47


1.8e-187 


636 .2 


1677 


Glyco_hydro_47


Glycosyl hydrolase family 47


4 .5e-74 


259.5 


1680 


WD40 


WD domain, G-beta repeat


l.lc-27 


105 . b 


168i 


WD40


WD domain, G-beta repeat


1 .le-27 


105 . 5 


1683 


MMR_HSR1


GTPase of unknown function


1 . 8e- 78 


274 . 1 


1691 


rrm 


RNA recognition motif . 


1 . Be- 37 


137 . S- 


1652 


rrm 


RNA recognition motif. 


1 . 8e-37 


137 . 9 


16S1- 


AAA 


ATPases associated with various
cellular act 


i.3e-8i 


284 . 5 


1697 


Ferric_reduct


Ferric reductase like 
transmembrane com 


8 .4e- 82 


285 . 7 


1696 


Ferric_reduct


Ferric reductase like 
transmembrane com 


3 .5e-53 


190 . : 


1699 


zf-C2H2


Zinc finger, C2H2 type 


4 .4e-34 


126 . 6 


170C 


arf 


ADP-ribosylation factor family


9e-19 


75.8 


1701 


GTP_EFTU 


Elongation factor Tu family 


0.014 


11.4 


170? 


SCAN 


SCAN domain 


1.8e-54 


194 .4 


1707 


pkinase


Eukaryotic protein kinase 
domain 


1.2e-B8 


307.9 


1709 


WD4 0 


WD domain, G-beta repeat 


0.0035 


24.0 


171C 


LRR 


Leucine Rich Repeat 


1 .2e-30 


115.3 1 


1711 


WW 


WW domain 


7.GC-12 


52. e 


1712 


ank 


Ank repeat


4 .2e-34 


126 . 7 


1713 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-09 


38.3 


1714 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-09 


38.3 


171£> 


ras 


Ras family 


4 . 4e-41 


149. 9 


171C 


HMG_box 


HMG (high mobility group) box


8 .3e-21 


82 .6 


1719 


TBC 


TBC domain 


1 . le-45 


165.2 


1723 


HLH


Helix-loop-helix DNA-binding
domain 


9 .2e-10 


45.9 


1723 


C3TIT1 


Double-stranded RNA binding
motif 


2.9e-05 


30.9 


1721 


RrnaAD 


Ribosomal RNA adenine 
dimethylases


0 . 045 


9.2 


1725 


CIDE-N


CIDE-N domain


5 .9e-40 


146 .2 


1726 


HAT 


HAT (Half-A-TPR) repeats


2.9e-44 


160 . b 


1728 


efhand


EF hand 


5 .le-20 


79.9 


1733 


Hist_deacetyl


Histone deacetylase family


1.7e-l04 


360 . 6 


1735 


LRR 


Leucine Rich Repeat 


4 . 6e-34 


126 * 6 


1739 


PI-PLC-X


Phosphatidylinositol-specific
phospholipase


0 .0023 


16 . 1 


1743 


ras 


Ras family 


3 . 7e - 10 


- 4 X . J 


1744 


ras 


Ras family 


j , /e- iu 


oi -a 
- 41 . c. 


1745 


RasGEF 


RasGEF domain 


1 O^. A Q 

3 - y 


l f b . y 


1746 


adh_short 


short chain dehydrogenase 


7 . le- 08 




1751 


zf-C2H2


Zinc finger, C2H2 type 


9e-39 


142.2 


1754 


f n3 


Fibronectin type III domain 


5 . 5e-101 


1 A A O 


"1756 






6 .3e-93 


322 . 1 


1758 


rrm


RNA recognition motif. 


0.017 


21.2 


1760 


Nop 


Putative snoRNA binding domain 


6.le-95 


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1C-95 


328.8 


1765 


MMR_HSR1


GTPase of unknown function 


6.4e-41 


149.4 


1769 


CNhydrolase 


Carbon-nitrogen hydrolase


3e-06 


-43.9 


1775 


ank 


Ank repeat 


4 .le-07 


37.1 


1779 


Oxysterol_BP


Oxysterol-binding protein


4.7e-56 


199.6 


1783


RhoGEF 


RhoGEF domain 


l.6e-23 


91.6 


"1784 _ 


RhoGEF


RhoGEF domain 


1.6e-23 


91.6


178* 


rrm


RNA recognition motif.


6 


4e-14 





TABLE 5



SEQ ID NO:


POSITION OF
SIGNAL IN AMINO
ACID SEQUENCE


MaxS (MAXIMUM
SCORE)


MeanS (MEAN
SCORE)


1-2: 


0 .991 


0.955 


2


1-33 


0.99S 


0 .944 


3 


1-33 


0 .94 9 


0.736 


4 


1-19 


0 .970 


0.951 


& 


1-26 


0 .971 


0 .863 


6 


1-26 


0 .971 


0.863 


7 


1-26 


0.971


0 .863 


8 


1-26 


0 .971 


0.863


9 


1-46 


0.982 


0.901 


10 


1-22 


0 .991 


0 .955 


11 


1-23 


0 .989 


0 .899 


12 


1-25 


0.955 


0 . 803 


13 


1 -1 1 


0 .932 


0 . 625 


14 


1-16 


0.938 


0.876 


1 5 


1 - 2 5 


0 . 94 1 


0.811 


16 


1-17 


0.972 


0.539 


17 


3-27 


0 . 96 4 


0.777 


1 8 


1-16 


0.914 


0.657 


1 9 


1 - 1 *♦ 


0 . 95 


O 84 0 


■5 n 

6 V 


i - 9 r 
j - u 


0 y3 5 


n irn 

VJ . IV X 


4 J. 


a - /. 


0.974 




TO 


i jj 


a on "i 
u . yu ji 


0.895 




1 - 1 V 


0 . 991 


0.959 


->A 
CH 


1- 3 _ 




O 944 


-} c 

Z Ti 




0 . 97 £ 


v . y j b 


i b 


JL ~ * ' 


0 . 996 


0 . 92 8 


t 1 


1-24 




0 . 73 9 


2 8 


1 - 2 2 


0 . 906 


0.688 


29 


J. - 3 T 




0 . 84 1 


30 


1-2 8 


fk C Q D 


0.893 


3 1 




n qq ■> 


n onr 
U ,y iv 


3 2 


1 - 2 2 




0 . 909 


3 5 


1-33 


0 . °4 9 


0 73 6 


36 


1-33 


0.949 


0.736 


4 6 


1-19 


0 . 570 


0.951 


67 


1-2 5 


0 . $6 £ 


0.848 


7 1 


1 - 1 fc 


0 . S4 9 


0.845 


72 


1 -? 0 


0.991 


0.919 


75 


1-29 


0 . 958 


0 . 854 


88 


1-20 


0.986 


0 . 94 5 


94 


1-33 


0 . 994 


0 . 943 


97 


1-46 


0 . 964 


0 .595 


103 


1-49 


0.983 


0 .570 


108 


1-26 


0.97 8 


0.885 


1X1 


1-23 


0.969 


0.899 i 


126 


1-25 


0.955 


0.803 


129 


1-1S 


0.963 


0.918 


13B 


1-29 


0.971 


0.844 


143 


1-18 


0.934 


0.628 


14 8 


1-20 


0.969 


0.904 


156 


1-25 


0.941 


0.811 


158 


1-22 


0.979 


0 .927 


160 


1-17 


0.972 


0.939 


161 


1-46 


0.903 


0.571 


162 


1-25 


0.937 


0.729 


168 


l-ie 


0.939 


0.826 


171 


1-27 


0.964 


0.777 


178 


1-21 


0.94 5 


0.825 


180 


1-27 


0. 981 


0.941 


187 


1-26 


0.982 


0.936 


190 


1-15 


0.953 


0.840 


196 


1-22 


0.975 | 


0.916 


197 


1-22 


0.963 


0.936


J 99 


1-2C 


C .52: 


0. 701 


2 00 


1-23 


G . 5 7 7 


0.773 


2 06 


1-30 


0 .5 64 


O.890 


207" 


i-15 


C .55i 


0. 924 


2 06 


1-22 


0 .574 


0.850 


21G 


1-40 


0.54 6 


0.670 


211 


1-23 


0.571 


0 . 849 


21€ 


1-24 


C.586 


0.956 


,>lb 


1-33 


0.56i 


0.895 


:is 


1-19 


0.57( 


0.871 


| 221 


1-19 


0 .504 


0.553 | 


222 


1-21 


o.s:7 


0.555 


230 


1-15 


0 .550 


0 . 959 


231 


1-26 


0 . 5 $ :* 


0.800 


232 


1-25 


j c .set 


0.826 


235 


1-23 


0 .565 


0.828 f 


240 


1-17 


0 .96: 


0.955 


241 


1-17 


0.58> 


0 .955 


245 


1-30 


C.57C. 


0.722 


24 e 


1-22 


0 .976 


0.935 


245 


1-23 


0.56* 


O.940 


2S2 


1-18 


0.9/: 


0.923 


261 


1 -24 


o.be:- 


0.587 


265 


1-18 


0 .53 5 


0. 86B 


272 


1-24 


0.553 


0 . 739 


283 


1-21 


0 . 50< 


0 .686 


284 

290 


1-29 


0 . 597 


0 . 854 


1-31 


0 . 5 & f 


0 . 84i 


302 


1-2 8 


0 . 58C 


0 . 893 


304 


1 - 1 b 


0.507 


0 .635 


3 1 2 


1-19 


0.953 


0.976 


313 


1-17 


0.93 0 


0 .753 


323 


1-22 


0 . 59* 


0. 909 


3 24 


1-17 


o . s e : 


D . 954 


328 


1-19 


0.57; 


0. 665 


329 


1-22 


0.563 


0 . 924 


3 3 0 


1-33 


0 .576 


o.e4i 


331 


1-24 


0. 52C 


0 . 712 


332 


1-24 


0 .575 


0 .883 


333 


1-19 


0.S6': 


0 . 541 


334 


1-20 


0.055 


0.567 


33b 


1-2-7 


0 .54 2 


0 .813 


336 


1-20 


0 . S 


6 . 850 


337 


1-38 


0.54 2 


0.653 • 


336 


1-27 


o . 5 3 


0.772 


339 


1-36 


0 . S75 


0.804 


34 0 


1-27 


0. 686 


0 . 597 


343 


1-19 


0. 9-,- 


0.865 


344 


1-22 


0. 95' 


0.928 


345 


1-17 


0.9* ( 


0 .687 


346 


1-19 


0.92C 


0. 822 


347 


1-22 


o.9e: 


0.924 


349 


1-24 


0. 962 


0.966 


351 


1-21 


o.snt 


0.815 


3 52~ 


1-31 


0. 96* 


0 . 912 


354 


1-31 


0. 974 


0.835 


355 


1-29 


0.932 


0.632 


356 


1-15 


0.594 


0.965 


357 


1-33 


0. 935 


0.726 


360 


1-27 


0. 93f 


0.82* 


363 


1-25 


0. 954 


0.674 


362 


1-22 


0.925 


0.788 


363 


1-21 


0.86; 


0.715 


364 


1-33 


0.976 


0. 841 


365 


1-33 


0.976 


0.841

3fct 


1-21 




0 . 51 1. 


C 


8 20 


3 6'- 


1 - 1 r 




L» . 3 J t 


o 


822 

out , 


3 66 


1-25 




u . y / < 




8 74 


3 7 C 


1-24 




U . J/. L 


n 


712 


3 1 - 


1-24 


n Q£L "i 
U . 7b J 


0 . 


773 


372 


1-27 


0 9 1 c 


0 . 


"768 


373 


1-15 


0 98 6 


o 


945 


37 1 


1-32 




^ . ✓ r 


o 


932 


3 7 1 


1-34 


0 yfi/ 


o 


81C 


37'/ 


1-1"/ 


0.99- 


o 


9 50 


3 76 


1-45 


0.973 


o 


74 5 


38C 


1-20 


0 . 96 8 


0 


8 74 


383 


1-2C 


0.926 


o 


1 fly- 

_| 


392 


1-15 


0.986 


0 


934 


3 8 3 


1-28 


0.965 




8 25 


384 


1-39 


0 . 9 7 C 


0 


-? 51 j 


38e 


1-24 


0.971 


0 


8 8] 


386 


1-3C 


0.985 


0 


8 66 


3 8$ 


1-15 


0 . 984 


0 


941 I 


3 9C 


1-26 


0 .97] 


0 


782 


39^ 


1-20 




0 .98] 


0 


900 


393 


1-16 




0 .968 


0 


890 j 


394 


1-23 


0 .93? 


0 


7 0] 


3 97 


1-22 


0.985 


0 


854 


395 


1-46 


0 .977 


0 


656 


401 


1-20 


0.895 


0 


567 


402 


1-22 




0 .96 7 


0 


931 


4 03 


1-27 


0 .992 


0 


934 


4 04 


1-19 


0.991 


0 


973 


40b 


1-23 


0.994 


0 


921 


4 07 


1-35 


0.987 


0 


658 


408 


1-35 


0 .9 7fc 


0 


551 


4 05 


1-33 


0 .897 


0 


570 


410 


1-25 


0.99C 


0 


962 


411 


1-38 


0.977 


0 


827 


412 


1-20 


0.944 


0 


768 


413 


1-20 


0.988 


0 


965 


414 


1-46 


0.993 


0 


638 


41E 


1-23 


0.96: 


0 


94 0 


417 


1-29 


0 .941 


0 


672 


418 


1 20 


0. 952 


0 


G 5C 


415 


1-19 


0 . 986 


0 


OCT 


4 20 


1-29 


0 . 965 


0 


b 6 i 


4 21 


1-22 


0.885 


Q 


7 85 


422 


1-46 


0.982 


o 


662 


4 24 


1-19 


0.975 


o 


933 


4 26 


1-38 


0.942 


o 


653 


— — 

4 JO 


1-18 


0 . 947 


o 


595 


4 32 


1-33 


0.55/ 


o 


789 


A "5 "i 

n o J 


1-26 


0 . 979 


o 


904 


A 1 C 


1-27 


0 . 96 2 


o 


777 


435 


1-24 


0 . 996 


o 


977 


** J D 


1-27 




o 


772 


4 4 J 


1-15 


0.966 


0 


940 


A A £ 


1-36 


0.975 


0 


804 


453 


1-41 


0 . 958 


0 


609 


455 


1-33 


0.943 


0 


606 


45-5 


1-27 


0.888 


0 


597 


462 


1-16 


0.S25 


0. 


681 


4 86 


1-27 


0.972 


0. 


845 


495 


1-24 


0.917 


0. 


636 


498 


1-26 


0.993 


0 


890 


505 


1-20 


0.976 


0. 


926 


507 


1-17 


0.966 


0. 


687 


510 


1-23 


0.930 


0 . 


593

Si . 


1-23 


0 .93 0 • 0 .SSI* 


S3 Z 


1-23 


0 93 0 


0 591- 


S 1 L 


1 - 1 & 


0 97 8 


0 . 9 5 ( 


~> 2 


1-19 


0.936 


0.822 


52' 


1-22 


0.263 


0.524 


54 ^ 


1-24 


0 98 2 


0 . 26( 


^ c r 


1 - 3 C 


0 93 3 


0.713 


CC' 


1 - 2 2 


0.973 


0.211 


5$4 


1-23 




0 7 b c 


57 j 


1-21 


0 91 & 


0 8 3- 


5 74 


1-31 




0 . 912 


580 


1-39 


0 c 2S 


0 55t 


594 


1 - ? 1 


0.974 


n o. i 1- 


60£ 


1-2 5 


0 . 93 2 


U . b^* 


609 


1 - 2 9 


0 . S3 2 


0.632 


610 


1-21 


0.990 


0.948 


621 


1-15 


0 . S94 


0.962 


621- 


1-33 


0.5Jb 


0 . 726 


6 5? 


1-27 


0 .938 


0 . 621 


666 


a- 2; 


0 . 92 9 


0 . 788 


67' 


l- it 


0 . 94 6 


0 . 807 


6Bh 


1-2: 


0.861 


0 . 71i: 


699 


1-22 


0.97 b 


0 . 81* 


702 


1- 31 


0 .56 8 


0. 898 


707 


1-10 


0 .860 


0 . 561 


71 3 


1-21 


0.966 


0 . 743 


vie 


1-19 


0 .936 


0 . 822 


715 


1-20 


0 .961 


0. 824 


729 


1-29 


0.972 


0 . 874 


73 b 


1-46 


0 .903 


0.596 


74f 


1-14 


0 . 916 


0. 731 


747 


1-2; 


0.965 


0 . 871 


74 8 


1-29 


0.968 


0 . 765 


75S 


1-24 


0.961 


0. 772 


76 7 


1-27 


0 . 919 


0 . 768 


768 


1-33 


0 . 900 


0 . 585 


7 73 


1-42 


0 . 959 


0 . 702 


779 


1-19 


0 . 986 


o . S4i 


797 


1-15 


0 . 944 


0.759 


798 


1-19 


0 . 900 


0 . 56£ 
0 . 95C 


820 


1-17 


0 . 995 


827 


1-49 


0 . 971 


0 . 749 


84 8 


1-20 


0 968 


0 874 


864 


1- 2C 


ft Q~>S> 


0 782 


866 


1-19 


0 986 


0 934 


8 73 


1-22 


0 . 948 


0.886 


881 


1- 2 6 


0 965 


0 . 825 


887 


1-39 


0 . 970 


0 . 553 


/ 




0 . 985 


0 . 868 


93 4 


1 » 4 £ 


0 . 988 


0 . 777 


93 9 


* J J 


0 . 994 


0 . 88S> 


94 4 


i. - z t 


0 .971 


0. 782 


950 


1-2 9 


0.957 


0. 845 


963 


1-2C 


0 . 981 


0. 900 


964 


1 - 2C 


0 .886 


0.556 


973 


1-16 


0 . 968 


0. 890 


980 


I 1-34 


0.961 


0.749 


981 


1-20 


0.953 


0.822 


984 


1-12 


0. 938 


0.78C 


1015 


1-22 


0.985 


0. 854 


1040 


1-46 


0.977 


0.696 


1052 


1-18 


0. 969 


0.842 


1059 


1-20 


0.927 


0 .867 


1065 


1-33 


0.983 


0.916 


1069 


1-22 


0.993 


0.935


1075 


i-2'. | C992 


0 . 934 


1080 


1-15 


C . 93 1 


0 . 829 


1092 


1-15 


\J . 3 3 -l 


0 . 973 


10 94 


L - 4 ( 


rt QQ9 


0 . 653 


1095 


1-3C 


0.974 


C . 929 


1105 


1-23 


u . yy** 


0.921 


1123 


1-35 


rt 0 0 1 


C . 6 58 


1138 


1-32 


0 . 954 


0.613 


1140 


1-36 


0.989 


0 . 789 


1142 


1-33 


0.897 


0 . 570 


1152 


1- 2i 


0 . 99 0 


0 . 962 


1170 


r 1- 3 e 


0 . 977 


0.827 


1176 


1-20 


0 . 94 4 


0.768 


1187 


1-20 


0 . 988 


0 . 96 5 


1189 


1-35 


0 . 967 


0.839 


1152 


1-4C 


0.993 


0 . 638 


1193 


1-10 


0.925 


0 . 710 


1197 


1-2? 


0.585 


0 . 853 


1206 


1-23 


o.sei 


0 . 940 


1225 


1-29 


0.941 


0. 67^ 


1245 


1-15 


0.586 


0. 567 


125e 


1-25 


0.565 


0. 86i 


1265 


1-22 


0. 869 


0.785 


1266 


1- 2C 


0.944 


0. 809 


1276 


l-4fr 


0.962 


O. 862 


1292 


1-19 


0.979 


0. 933 


1296 


1-21 


0.98< 


0. 944 


1297 


1-1? 


0 .994 


0. 953 


1332 


1-3F 


0.942 


0 . 653 


1358 


1-10 


0.947 


0.595 


1371 


1-33 


0 .957 


0.789 


1380 


l-2b 


0.979 


0.904 


1397 


1-27 


0.962 


0. 777 


1399 


1-23 


0.997 


O.960 


1404 


1-2- 


0.998 


0.977 


1410 


1-15 


0.946 


0. 84£ 


1414 


1-24 


0 .913 


0 . 5se 


1415 


1-1? 


0 . 982 


0 . 929 


1416 


1-12 


0.931 


0 . 893 


1418 


1-3C " 


0 . 933 


0 , 563 


1420 


l-2< 


0 . 881 


0 . 561 


1421 


1-1S 


0 . 990 


0 . 96B 


1423 


1-17 


0 . 968 


0 . 863 


1424 


1-21 


0.885 


0 . 593 


3425 


1-24 


0.913 


0 . 586 


1426 


1- 24 


0 . 913 


0 . 588 


1428 


1 Ol 

1- *rr 


v . yo / 


0 . 895 


143 0 


i. - j*t 


V . J 1 t 


0.819 


1431 


1-26 


0 . 979 


0 . 923 


1432 


1 _ 1 c 
J. - 


n oct 

U . J3 / 


0 . 613 


( 14 33 


1 .O" 

1 .3* 


v . y cx 


0 . 753 


1434 


1-39 


0.983 


0.623 


14 3 5 


1-25 


0.910 


0.631 


143 6 


1-42 


0.988 


0.868 ; 


143 7 


1-22 


0.998 


0 .980 


J. H * 4 


1-20 


0.918 


0.753 


1448 


1-12 


0 .931 


0.893 


1462 


1-18 


0.968 


0 .888 


1490 


1-2C 


0.881 


0.561 


1518 


1-17 


0.960 


0 .863 


1525 


1-23 


0.885 


0 .591 


1547 


1-26 


0.974 


0.891 


1561 


1-25 


0.967 


0.899 


1580 


1-17 


0 .923 


0.824 


1593 


1-2& 


0.979 


0.923


1596 


1 - 1 c 


0.929 t 0.70S 


1601 


1 - * (- 


0 .957 


0.613 


1606 


2-22 


0.975 


0.831 


1607 


2-20 


0.97<, 


0.770 


ieoe 


l-3i 


0 - 921 


0.753 


1614 


1-33 


0.969 


0.829 


1616 


1-20 


0.555 


0.869 


162b 


1-35 


0 . 983 


0.621 


1631 


1-25 


0.910 


0.631 


1636 


1-3 3 


0. 697 


0.S91 


1*39 


2-42 


0.986 


0.868 


1645


1-20


0. 527 


0.568 


1647


1-17


0.523 


0 .742 


1648


2-22


0.598 


0.980 






TABLE 6 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




1 


1787 


3573 


5255 


784CIP2_1 


110: 


2 


1788


3574 


536C 


784CIP2 2 


2671- 


3 ! 


1789 


3575


536i 


784CIP2_3 


411'^ 


4 1 


1790 


3576 


53G 2 


784CIP2_4 


555( 


5 


1791 


3577 


S36 3 


784CIP2_5


55 62 


6 1 


1792 


3578 


5364 


784CIP2 6 


5562 


7 


1793 


3579 


536 5 


784CIP2 7 


£562 


g 


1794 


3580 


5366 


784CIP2 8 


"5562 




1795 


3581 


536 7 


784CIP2 9 


556 j 


1 0 


1796 


3 582 


S36 8 


784CIP2_10


5564 


1 X 


1 IQ'i 
X / 3 I 


3 583 


536 9 


784riP2 11 


556i 


1 o 


1 *7 a o 


3 5 84 


537 C 


784C1P2 12 


56 8S 


13 


1799 


•iroc 


537 - 


7fi4TlP2 1 "i 
' oi< — If* J-J 


5725 


14 


JL OJU 




537 2 




574 i 


15 


1 80 Z 


3 587 


->o / - 


loan di t c 


57 7*' 


lb 


1 802 


Jbob 


S3" 4 




57 7' 


17 


1803 


3 589 


— 1 3 / C: 


7 Dan OO 1 "7 


57 8 *• 


T~~Q 1 

18 


1804 


3590 


537 fc 


/o4Llr« XO 


5792 


19 


1805 


3 591 


537 "7 


TO* DO "1 Q 




20 


1806 


3 592 


537 t 


/o4it». J.r* *u 


cone 


_ 

21 


1807 


3 593 


537 9 


/bflUP/ ZX 


c q n c 

9DU. 


22 


1 80B 


3 594 


538 0 






23 


1809 


3595 


536 j 


~i a a nro o *> 
/o^LIr* Z-J 


5844 


24 


1810 


3596


536 J 


ro4Llr< Z1 


58 5 (' 


2S 


1811 


3 597 


536 3 


foflt^irx Zb 


5 ?. . 6 * 


26 


1812 


3598 


536 4 


784CJP2_Z6 


5 97 j 


27 


1813 


3599 


538 5 






28 


1814 


3 600 


536 6 




5 99 L 


29 


1815 


3 601 


53E *7 


/O^LIFa ZJ 


60 05 


3 0 


IBIS 


3 602 


538 6 




6007 


31 


1817' 


J6UJ 


Q 


(O^Wlr* J X 


6007 


32 


1818 


3 604 


53S 0 




60 05 


i "j 
j j 


i m c 


3605 


53S 3 


784CIP2 33 


601 2 




X ozv 


3 606 


5392 


7B4CIP2 34 


6011 




J. o<c J 


3 C07 


535% 


704C1P2 35 


6016 


36 


1822 


3608 


5354? 


784CIP2 36 


6016 


3 7 


1823 


3609 


5395 


7B4CIP2 37 


6018 


"38 


1 824 


3610 


5396 


784CIP2 38 


6016 


39 


1 82b 


3611 


5397 


784C1P2 39 


6016 


40 


1826 


3612 


539b 


784CIP2_40 


6023 


41 


1827 


3613 


5399 


784C1P2_41 


607 V 


42 


1826 


3614 


54 Ot 


784C3P2_42 


608: 


43 


1829 


361S 


540: 


784C3P2 43 


6089 


44 


183C 


3616 


54C2 


784C1P2_44 


611* 


45 


1831 


3617 


5403 


784C1P2_45 


6116 


"46 


1832 


3618 


5404 


784C3P2_4 6 


6130 


■'47 ■ 


1833 


3619 


540S 


784C3P2_47 


6177 


48 


1834 


3620 


5406 


784CIP2_48 


618S 


49 


1835 


3621 


5407 


784C3P2_4 9 


6191 


SO 


1836 


3622


5408 


784C1P2_50 


6204 


*1 


1837 


3623 


5409 


784C1P2 51 . 


6204 


52 


1838 


3624 


541C 


784C1P2 52 


6284 


S3 ^ 


1839 


3625 


5411 


784CIP2_53 


6367 


54 


1840 


3626 


5412 


784C1P2 54 


6436 


55 


1841 


3627 


54i3 


784CIP2 55 


6442 


se 


1842 


3628 


5414 


784CIP2 56 


6445 


57 


1843 


3629 


5415 


784CIP2_57 


6457 


58 


1844 


3630 


5416__J 


784C1P2_58 


6458 


69 n 


1845 


3631 


5417 


784C1P2_59 


6458




6C 


184b 


3632 


1 54 16 


764CIP2_60 


6462 


61 


1847 


3632 


54 19 


784CIP2 6li 


6472 


62 


1846 


3634 


; 54 2C 


7S4CIP2__fc2 


6495 


b'j 


184° 


363£ 


542i 


7fi4CIP2_6I' 


6495 


€4 


1850 


3636 


\ 5422 


784CIP2_6<i 


6505 


6S 


1053 


3637 


; 5< 23 


784CIP2_6£ 


6534 I 


66 


1852 


3638 


5* 24 


784C1P26* 


6534 


67 


18 S3 


3635 


5<25 


784C1P2 6", 


654C 


68 


1854 


3640 


5426 


784C1P2 6fc 


6550 


69 


1855 


3641 


1 5427 


784CJP2 65 


6550 


70 


1 856 


3642 


5428 


7 84CIP2 7C 


6592 


71 


1857 


3643 


S429 


784CIP2 73 


6 64 5 


72 


1858 


3644 


£430 


704C1P2 72 


66 71 


73 


1359 


3645 


54 31 


784C1P2 12 


6763 


74 


i860 


364 6 


54 3 2 




6763 


75 


1861 


3647 


54 3 3 


784C1P2 71 


6786 


76 


1 862 


364 6 


54 34 


7B4CIP2 76 


6 8 24 


77 


j. 0 b j> 




c *4 3 5 




6830 


78 


1 ft <A 
1 0 O *i 


£ K.Ct 
J D DU 


51 O D 




6 8 31 


7 c 


1865 


j 0 j 1 




7 Q DO "7Q 


c u 1 0 

b 0 J <l 


BO 


1 cob 




c 4 3 8 




6 834 




185 7 


Jt>aJ 


54 3 9 


/ ()4^<lrZ Da 


68 34 


ft O 


18 66 


3654 


54 4 0 


7 a A f~"7 D "5 R "„ 
/ U4Lir^ 






1869 




54 4 1 


7 C vl /"* 7 DO O ~i 


b 0 J / 


ft/1 


18 70 


J bob 


5442 


/ Ofl Vw If OS 


6 84 3 


8 5 


18 71 


3 657 


54 4 3 




6859 


8 & 


1872 


J boo 


54 4 4 




6 9 1 h 


B 7 


1873 


3 655 


544 5 


/H4V- lrZ 0 / 




86 


1874 


3 66C 


544 6 


/o4v.iri ofc 


69S / 


Q O 


1 0 o c 


J bo J 








96 


1876 




54 4 8 


7B4nDO tin 
/Ofltir Z y U 


b_? / 


J J. 


7 « 7 *7 


J DO J 






6973 


S 2 


1878 


3 664 


54 5 0 




70 0 7 


1 q 4 


j. 0 / j* 






7 OA fT DO Q£ 


7018 




1880 


3 666 


54 52 




7019 




18 81 


3 667 


54 5" 1 


7ftdf TPO 


7020 


9 1 


1 882 


3 666 


54 54 




7020 


97 


1 8 83 


366S 


S4&5 


7A4C1P2 9& 


7021 


9 h 


1884 


3670 


5156 


784C1P2 99 


7023 


95 


188S 


3671 


5457 


784CIP2 100 


7027 


100 


1886 


3672 


5458 


784C1P2 103 


7028 


10J 


1887 


3673 


5459 


784CIP2 102 


7029 


102 


1888


3674 


5460 


7B4CIP2 103 


7031 


103 


1889 


3675 


5461 


784CIP2 104 


7032 


104 


1890 


3676 


5462 


784CIP2 105 


7033 


10?: 


1891 


3677 


5463 


784tlP2_106 


7035 


106 


1892 


3678 


5464 


784CIP2 107 


7036 


J 07 


1893 


3679 


5465 


784CIP2_106 


7039 


10S 


1894 


3680 


5466 


784C1P2 109 


7043 


109 


1895


3681 


5467 


784CIP2 110 


7044 


110 


1896 


3682 


5468 


784CIP2_111 


7046 


111 


1897 


3683 


5469 


784CIP2 112 


7054 


117. 


1898 


3684 


5470 


784CIP2_113 


7061 


112 


1899 


3685 


5471 


784CIP2_114 


7077 


114 


1900 


3666 


S472 


784C3P2 115 


7092 


lit 


1901 


3687 


5473 


7B4CIP2_116 


7094 


116 


1902 


3688 


5474 


784C3P2_117 


7106 


117 


1903 


3689 


5475 


784CIP2 118 


7107 


118 


1904 


3690 j 


5476 


784CIP2_119 


7111 


119 


1905 


3691 , 


5477 


784CIP2_120 


7123 


120 


1906 


3692 | 


5478 


784CIP2121 


7142 


121 


1907 


3693 j 


5479 


784C1P2_122 


7142




121 


1 908 


3 6 94 


54 8C 


784CIP2_123 


7154 


2 2 3 


1 90S 


3695 


5481 


784CIP2_124 


7160 


3 2 4 


1910 


3 696 


54 82 


784CIP2_I25 


7169 


12 5 


1 911 


3 697 


5483 


784CIP2_126 


7185 ' 


32 6 


1 912 


369£ 


54 84 


784C3P2 127 


7197 


12 7 


1 913 


3 69 5 


5485 


784CIP2_128 


7219 


12 0 


1914 


'37CC 


5486 


784CIP2_i25 


7226 


3 2? 


1915 


3701 


5487 


784CIP2_ 130 


7229 


1 3 C 


1916 


370; 


5486 


783CIP2131 


7234 


2 31 


1917 


3 703 


5489 


784CIP2_132 


7235 


132


1918


3704


5490


784CIP2_133 


723S 


133


1919


3705


5491


784CIP2_134


7238 




1920


3706


5492


784CIP2 135 


7247 






3707


5493


784C1P2 136 


7261 


136




3708


5494


784CIP2_137


7262 


137


1923


3709


5495


784CIP2_138


7267 


138


1924


3710




784CIP2 139 


7272 


139


1925


3711


5497


784CIP2_140


7273 


140


1926


3712


5498


784CIP2_141


7282 


141


1927


3713


5499


784CIP2_142


7288 


142


1928


3714


5500


784CIP2_143


7291 


143 


1929 


3715


5501


784CIP2_144


7293 


144


1930


3716


5502


784CIP2_145


7294


145


1931 


3717 


5503 




7299


146


1932


3718


5504 




7300 


147


1933


3719


5505 




7312 


148


1934


3720


5506 




7313


149


1935


3721


5507


784CIP2_150


7315


150


1936 


3722 


5508


784CIP2_151


7318


151 


1937 


3723 


5509




7321 


152


1938 


3724 


5510 


784CIP2_153


7330


153


1939


3725


5511


784CIP2_154


7331


154


1940


3726


5512 


784CIP2 155 


7333 


155


1941


3727


5513 


784CIP2 156 


73S0 


156


1942


3728


5514


784CIP2_157


7352 


157 


1943


3729




784CIP2_158


7384


158


1944 


3730


5516


784CIP2_159


7403


159


1945 


3731 


5517 


784CIP2_160


7431


160


1946




z>z>\ c 


784CIP2_161


7441 


161




J (Jo 




784CIP2_162 


7453 


162


1948


3734


5520


784CIP2_163


7467 


163


1949


3735 


5521 


784CIP2_164 


7471 


164


1950


3736


5522


784CIP2_165


7493


165


1951 


3737 


5523 


784CIP2 166 


7502 


166


1952


3738


5524 


784CIP2_167 


7511 


167 


1953 


3739


5525


784CIP2_168


7514


168


1954 


3740


5526


784CIP2_169


7520


169


1955


3741


5527


784CIP2_170


7541


170


1956 


3742 


5528 


784C1P2_171 


7570 


171 


1957 


3743 


5529 


784CIP2_172 


7578 


172 


1958 


3744 


5530


784C1P2 173 


7583 


173 


1959 


3745 


5531


784CIP2 174 


7592 


174 


1960 


3746 


5532 


784C1P2_175 


7601 


175


1961 


3747 


5533 


784C1P2 176 


7602 


176 


1962 


3748 


5534 


784CIP2 177 


7608 


177


1963


3749


5535 


784CIP2_178 


7615 


178


1964 


3750 


5536 


784CIP2_179


7617


179 


1965 


3751 


5537


784CIP2_181 


7624 


180


1966


3752


5538


784CIP2_182


7626


181


1967 


3753 


5539 


784CIP2 183 


7640 


182


1968 


3754 


5540 


784CIP2_184


7641 


183 


1969 


3755


5541 


784C1P2 185 


7641






1970 


3756


5542 


784C1P2_186 


7642 


185


1971


3757


5543


784CIP2_187


7642 


186


1972


3758


5544


784CIP2_188


7649 


187


1973


3759


5545


784CIP2 189 


7656 




1974


3760


5546


784CIP2_190


7657 


189


1975


3761


5547


784CIP2_191


7657 


190


1976


3762


5548


784CIP2_192


7662 


191


1977


3763


5549


784CIP2 193 


7668 


192


1978


3764


5550


784CIP2 194 


7673 


193


1979


3765


5551 


784CIP2 195 


7690 


194


1980


3766


5552


784CIP2_196


7700 


195


1981


3767


5553


784CIP2_197


7709 


196


1982


3768


5554 


784CIP2 198 


7736


197


1983


3769


5555 




7737 


198


1984


3770


5556


784CIP2_200


7744 


199


1985


3771




2ft "t 

yo*jk_j.r,t j. x 


7771 


200


1986


3772


5558


784CIP2_202


7786


201


1987 






784CIP2_203


7792 


202 


1988 


3774


5560


784CIP2_204


7797 


203


1989


3775


5561


784CIP2_205


7806 


204


1990


3776


5562


784CIP2_206


7812 


205


1991


3777


5563




7ftl 9 


206


1992


3778


5564 




7818 


207 


1993 


3779


5565




7822


208


1994


3780


5566


784CIP2_210


7827 


209


1995


3781


5567


784CIP2_211


7830


210


1996


3782


5568 


784CIP2_z 1* 


7835


211


1997 


3783 


5569 


784CIP2_214


7840


212


1998 


3784


5570


784CIP2_215


7858


213


1999 


3785


5571


784CIP2_216


7856


214


2000 


3786


5572


784CIP2_217


9 ft c 1 i 


215


2001


3787


5573


784CIP2_218


7866 


216


2002


3788


5574


784CIP2_219


7868 


217


2003


3789


5575


784CIP2_220


9 Q QC 


218


2004


3790


5576 




7898


219


2005


3791


5577


784CIP2_222


7900 


220


2006


3792


5578


784CIP2_223


7906


221




3793


5579


784CIP2_224


7908


222


2008


3794


5580


784CIP2_225


7909 


223


2009


3795


5581


784CIP2_226


7917


224


2010


3796


5582


784CIP2 227 


7932 


225


2011


3797


5583


784CIP2 228 


7940 


226


2012 




5584


784CIP2_229


7940 




2013 


3799


5585


784C1P2 230 


7984 


228


2014 


3800 


5586 


784CIP2 231 


7984 


229


2015


3801


5587


784CIP2_232


8001 


230 


2016 


3802 


5588 


784C1P2_233 


8021 


231


2017


3803


5589


784CIP2 234 


8025 


232 


2018 


3804


5590


784CIP2_235


8033 


233 


2019 


3805


5591 


784CIP2 236 


8040 


234 


2020


3806


5592 


784CIP2 237 


8052 


235


2021 


3807 


5593 


784CIP2 238 


8096 


236 


2022 


3808


5594 


784CIP2_239


8096 


237 


2023 


3809


5595


784CIP2_240


8113 


238 


2024


3810 


5596 


784C1P2 241 


8126 


239


2025


3811


5597 


784CIP2_242 


8132 


240 


2026 


3812 


5598 


784CIP2_243


8137


241


2027 


3813 


5599 


784CIP2 244 


8137 


242


2028


3814


5600


784CIP2_245


8159 


243 


2029 


3815 


5601


784CIP2_246


8159 


244 


2030 


3816 


5602 


784CIP2_247


8161 


245 


2031 


3817 


5603 


784CIP2_248 


8176


246 


2032


3818


5604


784CIP2_249


8196


247


2033


3819


5605


784CIP2_250


8200


248


2034


3820


5606


784CIP2_251


8212


249


2035 


3821


5607


784CIP2_252


8220


250


2036


3822 


5608 


784CIP2_253 


8238


251


2037 


3823


5609


784CIP2_254


8254


252


2038


3824


5610


784CIP2_255


8255


253


2039


3825


5611


784CIP2_256 


8288 


254


2040 


3826 


5612 


784CIP2_257 


8296


255


2041


3827 


5613 


784CIP2_258 


8329


256 


2042 


3828


5614


784CIP2_259


8362


257


2043


3829


5615


784CIP2_260


8429


258 


2044 


3830 


5616 


784CIP2_261 


8436


259


2045


3831


5617


784CIP2_262


8448


260 


2046


3832


5618


784CIP2_263


8472


261


2047 


3833


5619


784CIP2_264


8502


262 


2048


3834


5620 


784CIP2_265 


8504 


263 


2049 


3835 


5621 


784CIP2_266 


8507


264 


2050 


3836 


5622 


784CIP2_268


8509


265 


2051


3837 


5623 


784CIP2_269 


8515


266 


2052


3838


5624


784CIP2_270


8519


267 


2053 


3839


5625


784CIP2_271


8530


268 


2054 


3840


5626


784CIP2_272


8522 


269 


2055


3841


5627 


784CIP2_273 


8532 


270 


2056 


3842 


5628 


784CIP2_274


8539


271 


2057 


3843 


5629 


784CIP2_275 


8541 


272 


2058


3844


5630


784CIP2_276


8543


273 


2059


3845


5631


784CIP2_277


8593


274 


2060


3846 


5632 


784C1P2_278 


8595 


275 


2061


3847


5633


784CIP2_279


8615


276 


2062 


3848 


5634 


784CIP2_280 


8620 


277 


2063 


3849 


5635 


784CIP2_281 


862i 


278 


2064 


3850 


5636 


784CIP2_282


8623


279


2065 


3851 


5637


784CIP2_283


8625


280


2066


3852 


5638 


784CIP2_284 


8626 


281 


2067 


3853 


5639


784CIP2_285


8626


282 


2068 


3854


5640 


784CIP2_286 


862? 


283 


2069 


3855 


5641 


784CIP2_287 


8630


284 


2070


3856


5642


784CIP2_288


8621 


285 


2071


3857 


5643 


784C1P2 289 


8633 


286 


2072 


3858


5644 


784CIP2_290 


8624 


287 


2073 


3859 


5645 


784CIP2_291 


8625 


288 


2074 


3860 


5646 


784CIP2_292


8636 


289 


2075 


3861 


5647


784CIP2_293


8659


290


2076


3862 


5648 


784CIP2_294 


8660


291


2077


3863


5649


784CIP2_295


8667 


292 


2078


3864


5650


784CIP2_296


8667


293


2079


3865


5651


784CIP2 297 


8685 


294 


2080 


3866 


5652 


784CIP2_298 


8805 


295 


2081


3867 


5653 


784CIP2_299 


8896 


296 


2082 


3868 


5654 


784CIP2_300


8976 


297 


2083


3869


5655


784CIP2_301


9046 


298 


2084


3870


5656


784CIP2_302 


9046 


299 


2085


3871


5657


784C1P2 303 


9116 


300 


2086


3872


5658


784CIP2_304


9195 


301 


2087


3873 


5659 


784CIP2_305 


9201 


302 


2088


3874


5660


784CIP2_306


9307


303


2089


3875 


5661 


784C1P2 307 


9321 


304


2090


3876


5662


784CIP2_308


9397 


305 


2091


3877


5663


784CIP2_309


9405


306


2092


3878 


5664 


784CIP2_310


9406


307


2093


3879 


5665 


784C1P2_311 


9422


308


2094


3880


5666


784CIP2_312


9494


309 


2095


3881


5667


784CIP2_313


9512


310 


2096


3882


5668


784CIP2_314


9632


311


2097


3883


5669


784CIP2_315


9662


312


2098


3884


5670


784CIP2_316


9664


313


2099


3885


5671


784CIP2_317


9691


314


2100


3886


5672


784CIP2_318


970C | 


315


2101


3887


5673


784CIP2_319


9711


316


2102


3888 


5674 


784CIP2_320 


9721 


317


2103


3889 


5675 


784C1P2_321 


5870 


318


2104


3890


5676


784CIP2_322


9887


319


2105


3891 


5677 


784CIP2_323 


9923


320


2106


3892


5678


784CIP2_324 


9938 


321


2107


3893


5679


784CIP2_325


9964 


322


2108


3894


5680


784CIP2_326


10007


323


2109


3895


5681


784CIP2_327


10009


324 


2110 


3896 


5682 


784CIP2_328 


10046 


325 


2111 


3897 


5683


784CIP2_329 


10156 


326 


2112


3898


5684 


784C1P2_330 


10276 


327 


2113


3899


5685


784CIP2_331 


10283 


328 


2114 


3900


5686


784CIP2B_1


152 


329 


2115 


3901 


5687 


784C1P2B_2 


167 


330 


2116


3902


5688


784CIP2B_3


205 


331 


2117


3903


5689


784C1P2B_4 


210 


332 


2118


3904


5690


784CIP2B_5 


22 V- 


333


2119


3905


5691


784CIP2B_6


226 


334


2120 


3906 


5692


784CIP2B_7


264 


335 


2121 


3907 


5693


784CIP2B_8


266 


336 


2122 


3908 


5694 


784CIP2B_9


293 


337


2123


3909


5695


784C1P2B 10 


293 


338


2124


3910


5696


784CIP2B_11


2 9? 


339 


2125


3911 


5697 


784CIP2B_12 


30P 


340 


2126


3912


5698


784C1P2B_13 


311 


341 


2127 


3913 


5699


784CIP2B_14


352 


342 


2128 


3914 


5700


784CIP2B_15


358 


343 


2129


3915


5701


784CIP2B_16


366 


344 


2130


3916 


5702 


784CIP2B_17 


393


345 


2131


3917 


5703


784CIP2B_18


477


346


2132 


3918 


5704


784CIP2B_19


506


347 


2133 


3919 


5705 


784CIP2B_20 


506


348 


2134 


3920 


5706


784CIP2B_21


515 


349


2135


3921


5707


784CIP2B_22


576 


350 


2136 


3922


5708


784CIP2B_23 


586 


351 


2137 


3923 


5709 


784C1P2B_24 


59: 


352 


2138


3924 


5710 


784CIP2B_25 


592 


353


2139


3925 


5711 


784CIP2B_26 


594 


354 


2140 


3926 


5712 


784CIP2B_27


615 


355 


2141


3927 


5713 


784CIP2B_28 


620 


356


2142 


3928 


5714 


784CIP2B_29 


654 


357 


2143 


3929 


5715 


784C1P2B_30 


692 


358 


2144 


3930 


5716 


784CIP2B_31 


753


359 


2145 


3931 


5717 


784CIP2B_32 


756 


360 


2146 


3932 


5718 


784CIP2B_33 


787 


361 


2147 


3933 


5719 


784C1P2B_34 


833 


362 


2148


3934


5720


784CIP2B_35


835


363 


2149 


3935 


5721 


784CIP2B_36 


870 


364 


2150 


3936 


5722 


784CIP2B_37


891 


365 


2151 


3937 


5723 


784CIP2B_38 


851 


366 


2152 


3938 


5724


784CIP2B_39 


921 


367 


2153 


3939 


5725 


784C1P2B_40 


924 


368 


2154 


3940 


5726 


784CIP2B_41


932 


369 


2155 


3941 


5727


784CIP2B_42 


942




370 


2156


3942


5728


784C1P2B 43 


956 


371 


2157


3943


5729


784CIP2B_44


968 


372 


2158


3944 


5730 


784C1P2B 45 


9S2 


373


2159


3945


5731


784CIP2B_46 


1025 


374 


2160


3946


5732


784CIP2B_47


1074 


375 


2161


3947


5733


784CIP2B_48


1104


376 


2162


3948 


5734 


784C1P2B_49 


1114 


377 


2163 


3949


5735


784CIP2B_50 


1144 


378


2164


3950


5736


784CIP2B_51


1262


379 


2165 


3951 


5737 


784CIP2B_52 


131B 


380 


2166 


3952 


5738


784CIP2B_53 


1315 


381


2167


3953


5739


784CIP2B_54 


1328 


382 


2168 


3954 


5740 


784CIP2B 55 


1436 


383 


2169


3955


5741


784CIP2B_56


1464 


384 


2170


3956


5742


784CIP2B_57


1584


385 


2171


3957


5743


784CIP2B_58


1617


386 


2172


3958 


5744 


784CIP2B_59 


1724 


387 


2173 


3959 


5745


784CIP2B_60 


1726 


388


2174 


3960 


5746 


784CIP2B_61


1772 


389 


2175 


3961 


5747


784CIP2B_62


1805 


390 


2176 


3962 


5748


784CIP2B 63 


1868 


391 


2177 


3963 


5749


784CIP2B_64


1896 


392 


2178


3964


5750


784CIP2B_65 


1926 


393 


2179


3965 


5751 


784CIP2B_66 


1965 


394 


2180


3966


5752


784CIP2B_67 


1967 


395


2181


3967 


5753 


784CIP2B_68 


1995


396 


2182


3968


5754 


784CIP2B_69 


2005 


397 


2183


3969 


5755 


784CIP2B_70 


2027 


398 


2184 


3970 


5756


784CIP2B_71


2055 


399 


2185


3971


5757


784CIP2B_72


2103 


400 


2186 


3972 


5758 


784C1P2B_73 


2106 


401


2187


3973


5759


784CIP2B_74 


2166 


402 


2188


3974 


5760 


784CIP2B_75 


2175 


403


2189 


3975 


5761 


784CIP2B_76 


2176 


404


2190


3976


5762 


784CIP2B_78 


2236 


405 


2191


3977 


5763 


784CIP2B_79 


2250 


406


2192


3978 


5764 


784CIP2B_80


2300 


407


2193 


3979 


5765 


784CIP2B_81


2323 


408 


2194 


3980 


5766 


784CIP2B_82 


2340


409


2195


3981


5767 


784C1P2B_83 


2373 


410 


2196 


3982 


5768


784CIP2B_84


2399


411 


2197 


3983 


5769 


784CIP2B_85


2411 


412 


2198


3984 


5770 


784CIP2B_86 


2428 


413 


2199


3985 


5771 


784CIP2B_87 


2430 


414 


2200


3986 


5772 


784CIP2B_88 


2435 


415 


2201 


3987 


5773 


784CIP2B_89 


2447 


416 


2202


3988


5774 


784CIP2B 90 


2461 


417 


2203 


3989


5775 


784CIP2B_91 


2487 


418


2204 


3990 


5776 


784CIP2B_92


2492 


419 


2205 


3991


5777 


784C1P2B_93 


2512 


420


2206 


3992 


5778 


784C1P2B 94 


2564 


421 


2207 


3993 


5779 


784CIP2B_95


2678 


422 


2208


3994


5780


784CIP2B_96


2816


423 


2209 


3995 


5781


784CIP2B_97 


2818 


424 


2210


3996


5782


784CIP2B_98 


2819 


425 


2211


3997


5783


784CIP2B_99 


2943 


426 


2212 


3998 


5784 


784CIP2B_100 


3137 


427 


2213 


3999


5785


784CIP2B_101


3137 


428 


2214 


4000 


5786


784CIP2B_102 


3160 


429


2215


4001 


5787 


784CIP2B_103 


3323 


430 


2216


4002 


5788 


784CIP2B_104 


3360 


431 


2217 


4003 


5789 


784CIP2B_105 


3362




432


2218


4004


5790


784CIP2B_106


34 3 '. 


433


2219


4005 


5791 


784CIP2B_107 


341* 


434 


2220 


4006 


5792 


784CIP2B_108


344: 


435 


2221 


4007 


5793


784CIP2B_109 


344: 


436 


2222 


4008


5794


784CIP2B_110


344< 


437


2223 


4009 


5795 


784CIP2B_111


3 85! 


438


2224 


4010 


5796 


784CIP2B_112


3 86 :■ 


439


2225 


4011 


5797 


784CIP2B_113


409i 


440


2226 


4012 


5798


784CIP2B_114


410: J 


441 


2227 


4013 


5799


784CIP2B_115


414: 


442 


2228 


4014 


5800


784CIP2B_116


414; 




443 


2229 


4015 


5801 


784CIP2B_117


4145- 


444 


2230 


4016 


5802 


784CIP2B_118


419( 


445 


2231 


4017 


5803


784CIP2B_119


420V 


446


2232


4018


5804


784CIP2B_120


4274 


447 


2233 


4019


5805


784CIP2B_121


4304 


448 


2234


4020 


5806 


784CIP2B 122 


4301 




449


2235


4021


5807


784CIP2B 123 


431". 


450 


2236 


4022 


5808


784CIP2B_124


432. 


451 


2237


4023


5809


784CIP2B_125


43 21* 


452 


2238 


4024 


5810 


784CIP2B_126


4332 


453


2239


4025


5811


784CIP2B_127


4 4 8 b 


454 


2240


4026


5812


784CIP2B_128


4S8t 


455


2241


4027


5813


784CIP2B_129


556 ir 


456 


2242 


4028


5814


784CIP2B_130


5573






4029 


5815


?fl/inp?R in 

rO'iS-Xtr^ij X J X 


557 7 


458


2244


4030


5816


fl?t\,ir 4LO XJ4 


557 - 


459


2245


4031


5817 


/ o *» \_ j. Jr a. D x ~i o 


55 8s 


460


2246


4032


5818


7 Rirt P"?R 1 

f UtV^I* fcD X J "! 


558 "* 


461


2247


4033


5819


784CIP2B_135


5584 


462


2248 


4034 


5820 


784CIP2B_136


558: 


463


2249


4035


5821


784C1P2B 137 


5593 


464


2250


4036


5822


784CIP2B_138


5593


465 


2251 


4037 


5823 


784CIP2B_139


5594 


466 


2252 


4038 


5824


784CIP2B_140


5594 


467


2253


4039


5825


784CIP2B_141


5596 


468 


2254 


4040 


5826 


784C3P2B 142 


5602


469


2255


4041


5827


784CIP2B_143


56 0L 


470 


2256 


4042 


5828 


784CIP2B 144 


5606 


471 


2257 


4043 


5829 


784CIP2B_145


5617 


472 


2258 


4044 


5830 


784C1P2B146 


5620 


473 


2259 


4045 


5831


784CIP2B_147


5622


474


2260 


4046 


5832 


784CIP2B_148


5623 


475 


2261 


4047 


5833 


784CIP2B_149


5624


476 


2262 


4048


5834 


784CIP2B_150 


5625 


477 


2263 


4049 


5835 


784CIP2B_151


5627 


478 


2264 


4050 


5836 


784CIP2B_152 


5628 


479


2265


4051


5837


784CIP2B_153


5630 


480 


2266 


4052 


5838 


784CIP2B_154 


5632 


481 


2267 


4053 


5839


784CIP2B__155 


564C 


482 


2268 


4054 


5840 


784C1P2B 156 


5643 


483 


2269 


4 055 


5841 


784C3P2B_157 


5643 


484


2270 


4056 


5842 


784CIP2B_158 


5647 


485 


2271 


4057 


5843 


784CIP2B_159 


5645 


486 


2272 


4058 


5844 


784CIP2B_160 


S65E | 


487 


2273 


4059 


5845 


784CIP2B 161 


5655 


488 


2274 


4060 


5846 


784C1P2B 162 


"" 5667 


489


2275 


4061 


5847 


784CIP2B 163 


5672 


490 


2276 


4062 


5848


784CIP2B 164 


5674 




2277 


4063 


5849


7B4CIP2B 165 


5675 


492 


2278 


4064 


5850


784CIP2B_166 


5680 


493 


2279 


4065


5851


784CIP2B_167


5684




494


2280


4066




784CIP2B_168


5686


495


2281


4067


5853


784CIP2B_169


ecu 


496


2282


4068


5854


784CIP2B_170


c/ op 1 


497


2283


4069


5855


784CIP2B_171




4 9 £ 


O "5 Q4 


a m n 






5712 


499


2285


4071


5857


784CIP2B_173


D / 1 -? 


500 


22B6 


4072 




/OHV.lriD J. 1 h 


r. nTr\ 
3 / Z L# 


SOU 


* so / 


^ u / J 




IRACl P7Q 1 7C 




SCI 


2288 


4 074 


c ft £ n 


f tr CO l*t 


5 73 0 


503 


2269 


4 U / D 


tool 


/ 0SL.1 xr £t> 1 / > 


5 734 


504 


2290 


4076 


5662 


/o4Cl P2B^1 /c 


5738 


505


2291


4077


5863


784CIP2B_179


5739 


506 


2292 


4078 


5864 


/U4C1P2B__1 ol> 


574 0 


507


2293


4079


5865


784CIP2B_181


5744 


508


2294


4080


5866


784CIP2B_182


5748


505 


2295 


4081 


5867 


784ClP2B_18j > 


574 9 


510 


2296 


4082 


5668 


784CIP2B_104 


5750 


511


2297


4083


5869


784CIP2B_185


5750 


51^ 


2298 


4084 


5870 


7S4CIP2B_18t 


5750 


51 


2299 


4085 


5B71 


7B4C1P2B_18'< 


5761 


514 


2300 


4086 


5872 


7 84CIP2B_iet 


5762 


515 


| 2301 


4087 


5873 


704CIP2B_18f: 


| 5767 


516


2302


4088


5874


784CIP2B_190


5773


517 


2303 


4089


5875 


784C1P2B 19J 


5783 


518 


2304 


4090 


5876 


784ClP2B_19i 


5784 


519


2305 


4091 


5877 


7 84CIP2B_193 


5788 


520


2306 


4092 


5878 


7 8 4CIP2B_194 


5798 


521 


2307 


4 093 


5879 


784CIP2B_19t 


5807 


52i 


2308 


4094 


5880 


7 84C1P2B_197 


5818 


523 


2309 


4095 


5881


784CIP2B_198


5815 


524 


2310 


4096 


5882


784CIP2B_199


5827 


525


2311 


4097 


5883 


7 84CIP2B_20C 


5828 


526 


2312 


4098 


5884 


7 84CIP2B_201 


5842 


527 


2313 


4099


5885 


784CIP2B_202 


5853 


528


2314 


4100 


5886 


7 84CIP2B_203 


5861 


529


2315 


4101 


5887 


7 84C1P2B_204 


5864 


530 


2316 


4102 


5888 


7 84CIP2B_2 0S 


5865 


531 


2317 


4103 


5889


7 84C1P2B__206 


5871 


532 


2318 


4 104 


5890 


784CIP2B 207 


5873 


533 


2319 


4 105 


5B91 


1 Qjl^triOTl one 

/84CIP2B 20c 


5B73 


534 


2320 


4 106 


5892 


7 B4CIP2B^Z0t 


5875 


535 


23 21 


4 107 


5893 


/ o^Llir^O /1U 


3d / o 


536 


2322 


4 108 


5894 


*7 f> Afi mn 01 '. 


5879 


3J / 


2323 


4 10S 


CO OC 


/ CS^.J.A'Z2>__ii J. < 


5880 


c 

pJ o 


2324 




5896 


7 0 ATI BTR 1 ^ 
/ a ** LirzD_Z i ^ 


58 80 




2325 


4111 


58 97 


/ O *t X Jr < 0_* J *t 


58 80 




2326 


Ail"? 


CQQC 


7ft4PIP9ft 71 c 


58 80 




A J A f 


ill) 
s 11 J 


CQOQ 


/OILlrZO /■ A o 


5885 




ft 

* J 4 o 


4 1 1 A 


J7VU 


7B4rTP9"R 917 


5895 


54 j 




4115 


59 01 




5898 




£. J J\J 


4116 








CA C 




411/ 






5904 


546 


2332 


4118 


5904 


784CIP2B 221 


5918 | 


547 


2333 


4119 


5905 


784CIP2B_22i 


5921 


545 


2334 


4120 


5906 


784CIP2B 223 


5927 


545 


2335 


4121 


5907 


784CIP2B 224 


5932 


55C 


2336 


4122 


5908 


7 84CIP2B_225 


5935 


551 


233? 


4123 


5909 


784CIP2B 226 


5945 


552 


2338 


4124 


5910 


784CIP2B_227 


5946 


553 


2339 


412S 


5911 


784CIP2B 22 e 


5947 


554 


2340 


4126 


5912 


784CIP2B 229 


5956 


555 


2341 


4127 


5913 


784CIP2B_230 


5967




556


2342


4128


5914


784CIP2B_232


59 Vb 




2343 


4129


5915


784CIP2B_233 


5977 




2344 


4130


5916 


784CIF2B 234 


5978 




2345 


4131


5917 


784CIF2B^235 


5979 


560 


2346 


4132 


5918 


784CIF2B^236 


5980 


561


2347 


4133 


5919 


784Cir2B 237 


ssee 


562


2348


4134


5920 


784C1F2B_238 


5985 


563


2349 


4135 


5921 


784C5 F2B_239 


599i 


564 


2350 


4136


5922 


784CIF2B^240 


[ S997 


565 


2351 


4137 


5923 


784CIr2B_241 


5998 


566


2352 


4138 


5924 


784CIF2B_242 


6003 


567


2353


4139


5925 


784C1F2B_243 


6004 


568


2354 


4140 


5926 


784CIF2B 244 


6013 


569


2355 


4141 


5927 


784CI~F2B_245 


6028 


570 


2356 


4142 


5928


784C1F2B 246 


602B 


571


2357 


4143 


5929


784C1F2B_247 


602 5 


572


2358


4144 


5930 


784CIF2B_248 


6031 


573


2359 


4145 


5931 


784C1F2B_249 


6031 


574 


2360 


4146 


5932 


784CIF2B_250 


6032 


575


2361


4147 


5933 


784CIF2B_251 


6037 


576


2362


4148 


5934 


784C1F2E^252 


6037 


577


2363


4149


5935 


784C3F2B_253 


6043 


578


2364 


4150 


5936


784C1F2B_254 


' 6044 


579


2365


4151


5937 


784CIF2B_255 


6046 


580


2366


4152


5938 


784CIP2B_256 


6048 


581


2367 


4153 


5939 


784C3P2B__257 


6049 


582


2368 


4154 


5940 


784C1F2B 258 


6051 


583


2369


4155


5941


784CJF2B 259 


6055 


584


2370


4156


5942 


784C1 P2B_260 


6060 


585 


2371 


4157 


5943 


784C2P2B_261 


6063 


586


2372


4158


5944 


784C1P2B_262 


606* 


587 


2373 


4159 


5945 


784C3P2B 263 


6067 


588


2374 


4160 


5946 


784CIF2B_264 


60$t 


589


2375


4161


5947


784CJP2B_265 


6073 


590


2376


4162


5948


784CIP2B_266 


6076 


591 


2377 


4163 


5949 


784C1P2B_267 


6076 


592 


2378


4164


5950


784C1P2B_268 


6077 


593


2379


4165 


5951 


784CIP2B_269 


6075- 


594 


2380


4166 


5952 


784C1F2B_270 


6082 


595 


2381 


4167 


5953 


784CIF2E_272 


6088 


596 


2382


4168


5954 


784CIP2B_273 


6093 


597 


2383


4169


5955 


784C1P2B_274 


6094 


598


2384 


4170 


5956 


784CJF2B 275 


6101 


599 


2385 


4171 


5957


784C1P2B276 


6103 


600


2386


4172


5958


784CIP2B_277


6104 


601


2387 


4173 


5959 


784CJF2B278 


6108 


602


2388 


4174 


5960 


764C1P2E 279 


6112 


603


2389


4175 


5961 


784C1P2B 280 


6121 


604


2390 


4176 


5962 


7$4CJP2B_281 


6125 


605 


2391 


4177 


5963 


784CIP2fi_282 


6126 


606 


2392 


4178 


5964 


784CIP2B 283 


6128 


607 


2393 


4179


5965


784CIP2B 284 


'6129 


608 


2394 


4180 


5966 


784C1P2B285 


6133 


609 


2395 


4181 


5967 


784CIP2B 286 


6133 


610


2396 


4182 


5968 


764dP2B_287 


6135 


611


2397 


4183 


5969 


784CI?2B_288 


6139 


612 


2398 


4184 


5970 


784C1F2B 289 


6141 


613 


2399 


4185 


5971 


7B4CIP2B_290 


6145 


614 


2400 


4186 


5972 


784CIP2B 291 


6146 


615 


2401 


4187 


5 973 


784CIF2B_292 


6146 


616 


2402 


4188 


5974 


784CIF2B 293 


6149 


617 


2403


4189 


5975 


784C1F2B_294 


6149




618


2404


4190


5976


784CIP2B_295


6153 


619


2405


4191


5977


784CIP2B_296


6159


620


2406


4192


5978


784C1P2B 297 


6164 




2407 


4193 


5979


784C1F2B_298 


6167 


622 


2408 


4194


5980


784CIP2B_299


6172 


623 


2409 


4195


5981


784C1P2B_300 


6173 


624


2410 


4196 


5982 


784CIP2B_ 301 


6190 


625


2411 


4197 


5983 


784C1P2E 302 


6194 


626 


2412 


4198


5984


784CIP2B_303


6196 


627


2413


4199


5985 


784CIP2B304 


6197 


628


2414


4200


5986


784C1P2B305 


6198 


629 


2415 


4201


5987 


784CIP2P 306 


6138 


630 


2416 


4202 


5988


784CIP2B_308


6214 


631 


2417 


4203 


5989 


784CIF2B309 


6215 


632 


2418 


4204 


5990 


784CIP2B_310 


6219 


633 


2419 


4205 


5991 


784C1P2B 311 


6226 


634


2420


4206


5992


784CIP2B_312


6229 


635 


2421 


4207 


5993 


784CIP2B 313 


6234 


636 


2422 


4208 


5994 


784CIP2B_314


6237


637 


2423 


4209


5995


784CIP2B_315


6238 


638


2424


4210


5996


784CIP2B_316


6239 


639


2425


4211


5997


784CIP2B_317


6233 


640 


2426 


4212 


5998


784CIP2B_318


6239 


641 


2427


4213


5999


784CIP2B_319


6240 


642


2428 


4214 


6000 


784CIP2B 320 


6244 


643 


2429 


4215


6001


784CIP2B_321


6245 


644 


2430 


4216


6002


784CIP2B_322


6250 


645 


2431 


4217 


6003 


784CIP2B_323


6252 


646


2432


4218


6004


784CIP2B_324


6252 


647


2433


4219


6005


784CIP2B_325


6256 


648


2434


4220


6006


784CIP2B_326


6260 


649


2435


4221


6007


784CIP2B_327


6261 


650


2436


4222


6008


784CIP2B_328


6264 


651


2437


4223


6009


784CIP2B_329


6265 


652


2438


4224


6010


784CIP2B_330


6266 


653 


2439 


4225


6011


784CIP2B_331


6270 


654


2440


4226


6012


784CIP2B_332


6271


655


2441


4227


6013


784CIP2B_334


6274 


656 


2442 


4228


6014


784CIP2B_335


6276 


657


2443


4229


6015


784CIP2B_336


6281


658 


2444 


4230 


6016 


784CIP2B_337


6281 


659


2445


4231


6017


784CIP2B_338


6288 


660 


2446 


4232 


6018 


784CIP2B_339 


6292 


661 


2447 


4233


6019


784CIP2B_340


6294 


662 


2448


4234


6020


784CIP2B_343


6312 


663


2449


4235


6021


784CIP2B_344


6312 


664


2450


4236


6022


784CIP2B_345


6312 


665


2451 


4237 


6023 


784C1P2B346 


6322 


666 


2452 


4238 


6024 


784C1P2B347 


6324 


667


2453


4239


6025


784CIP2B_349


6329 


668


2454 


4240 


6026 


784CIP2B 350 


6331 


669


2455 


4241 


6027 


784CIP2B_351 


6333 


670 


2456 


4242 


6028 


784C1P2B 352 


6334 


671 


2457


4243 


6029 


784CIP2B_353 


6337 


672 


2458 


4244 


6030 


784CIP2B_354


6339 


673 


2459 


4245


6031


784CIP2B_355


6346 


674


2460 


4246 


6032 


784C1P2B_356 


6348 


675 


2461 


4247 


6033 


784C1P2B_357 


6348


676 


2462 


4248 


6034 


784CIP2B_358


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784C1P2B_360 


6355 


679 



2465 


4251 


6037


784CIP2B_361 


6362


WO 01/53312



PCT/US00/34263







680


2466


4252


6038


784CIP2B_362


636& 


681


2467


4253


6039


784CIP2B_363


636^: 


682 


2468


4254


6040


784CIP2B_364


6371 


683


2469


4255


6041 


784CIP2B 365 


6376 


684 


2470


4256


6042


784CIP2B_366


6375


685 


2471


4257


6043


784CIP2B_367


6380


686


2472


4258


6044


784CIP2B_368


6383


687


2473


4259


6045


784CIP2B_369


6392 


688 


2474 


4260


6046


784CIP2B_370


6395


689


2475


4261


6047


784CIP2B_371


6397


690


2476


4262


6048


784CIP2B_372


6400


691


2477


4263


6049




6401


692 


2478


4264


6050


784CIP2B_374


6411


693


2479


4265 


6051 




6411


694


2480


4266






6411 




2481


4267


6053




6416 


696


2482


4268






£4 "1 ft 


697 


2483


4269


6055


/ 0 H 1 r 1 1> 3 / J 


OH 4 *. 


698


2484


4270


6056






699 


2485


4271


6057






700


2486


4272 


6058 


784CIP2B_382


6427


701 


2487


4273


6059


784CIP2B_383


6426


702 


2488


4274


6060


784CIP2B_384


6429


703 


2489


4275


6061


784CIP2B_385


6430


704 


2490


4276


6062


784CIP2B_386


6432


705


2491


4277


6063


784CIP2B_387


6432


706 


2492 


4278


6064


784CIP2B_388


6438


707 


2493 


4279 


6065 


784CIP2B_389


6441


708 


2494


4280


6066


784CIP2B_390


6446


709


2495


4281


6067


784CIP2B_391


6454


710 


2496


4282


6068


784CIP2B_392


6455


711


2497


4283


6069


784CIP2B_394


6461


712 


2498


4284


6070




£Td C "7 

b / ; 


713 


2499


4285


6071


784CIP2B_396


6468


714


2500


4286


6072


784CIP2B_397


6487


715 


2501


4287


6073


/OHwlr* 0 J 7 0 




716 


2502


4288


6074 


TQinon loo 


6506* 


717 


2503


4289


6075




6514 


718




4290


6076 




6515 


719




4291


6077 




6521 




2506


4292


6078




6532


721 


2507 


4293 




784CIP2B_405


6536 


722 


2508


4294


6080


784CIP2B_406


6543 


723 


2509


4295


6081


784CIP2B 407 


6544 


724 


2510 


4296


6082 


784CIP2B 408 


6548


725 


2511


4297 


6083 


784CIP2B 409 


6551 


726 


2512 


4298 


6084 


784CIP2B 410 


6553 


727 


2513 


4299 


6085 


784CIP2B 411 


6552 


728 


2514 


4300 


6086 


784CIP2B_412


6554


729


2515


4301


6087 


784CIP2B 413 


6556 


730 


2516


4302 


6088 


784CIP2B 414 


6560


731 


2517 


4303 


6089


784CIP2B 415 


6563 


732 


2518 


4304 


6090 


784CIP2B 416 


6564 


733


2519 


4305 


6091 


784CIP2B 417 


6567 


734 


2520 


4306 


6092 


784CIP2B 418 


6573 


735 


2521


4307 


6093 


784CIP2B_419 


6575 


736 


2522 


4308 


6094 


784CIP2B 420 


6577 


737 


2523 


4309 


6095 


784CIP2B_421


6593 


738 


2524 


4310 


6096 


784CIP2B_422 


6595 


739 


2525


4311 


6097 


784CIP2B_423 


6599 


740 


2526 


4312 


6098 


784CIP2B_424 


6625 


741 


2527 


4313 


6099 


784CIP2B_425


6625


742 


2528


4314


6100


784CIP2B_426


662( 


743 




4315 


6101 


784CIP2B_427


6630


744 


2530


4316


6102


784CIP2B_428


6631 


745 


2531


4317 


6103 


784CIP2B 429 


6632 


746 


2532


4318


6104


784CIP2B_430


6633


747 


2533


4319


6105


784CIP2B_431


6634 


748 


2534


4320 


6106 


784CIP2B_432 


6636 


749 


2535


4321 


6107 


784C1P2B_433 


664) 


750 


2536


4322 


6108 


784CIP2B_434 


6644 


751 


2537


4323


6109


784CIP2B_435


6646


752 


2538


4324


6110


784CIP2B_436


6648


753 


2539


4325


6111


784CIP2B_437


6652 


754 


2540


4326


6112


784CIP2B_438


6654 


755


2541


4327


6113


784CIP2B_439


6657 


756 


2542


4328


6114


784CIP2B_440


6656 


757 


2543


4329


6115


784CIP2B_441


6663 


758 


2544 


4330 


6116 


784C1P2B_442 


6664 


759 


2545


4331


6117


784CIP2B_443


6666 


760 


2546


4332


6118


784CIP2B_444


6665> 


761 


2547


4333 


6119 


784CIP2B_445 


6673 


762 


2548


4334 


6120 


784C1P2B_446 


6685 


763 


2549


4335 


6121 


784C1P2B_447 


6687 


764 


2550


4336


6122


784CIP2B_448


6689 


765 


2551 


4337 


6123 


784CIP2B_449


6693 


766 


2552


4338


6124 


784C1P2B_450 


6696 


767 


2553


4339 


6125 


784C1P2B_451 


6699 


768 


2554


4340


6126


784CIP2B_452


6705


769 


2555


4341


6127 


784CIP2B_453 


6711 


770 


2556


4342


6128


784CIP2B_454 


6713 


771 


2557


4343 


6129 


784C1P2B_455 


6716 


772


2558


4344 


6130 


784CIP2B_456 


6725 


773 


2559


4345


6131


784CIP2B_457


6726 


774 


2560


4346


6132


784CIP2B_458


6727 


775 


2561


4347


6133


784CIP2B_459


6730 


776 


2562 


4348 


6134 


784CIP2B_460


6730 


777 


2563


4349


6135


784CIP2B_461


6730 


778 


2564 


4350 


6136 


784CIP2B_462


6732 


779 


2565


4351


6137


784CIP2B_463


6733 


780 


2566


4352


6138


784CIP2B_464 


6737 


781 


2567


4353


6139


784CIP2B_465


6741


782


2568


4354


6140


784CIP2B_466 


6751 


783


2569


4355


6141 


784CIP2B 467 


6754 


784 


2570 


4356 


6142 


784C1P2B_468 


675B 


785 


2571


4357 


6143 


784CIP2B_469


6761 


786 


2572 


4358 


6144 


784CIP2B_470 


6765 


787 


2573 


4359 


6145 


784C3P2B_471 


6768 


788 


2574 


4360


6146 


7 84CIP2B_4 72 


6773 


789 


2575


4361


6147 


784CIP2B_473 


6776 


790 


2576


4362


6148 


784CIP2B_474 


6796 


791 


2577 


4363 


6149 


784CIP2B_475 


6796 


792 


2578


4364


6150


784CIP2B_476


6823 


793 


2579


4365 


6151 


784CIP2B 477 


6825 






4366


6152






795 


2581


4367 


6153 


784CIP2B_479 


6835 


796 


2582 


4368 


6154 


784CIP2B_4 80 


6844 


797 


2583 


4369 


6155 


784C1P2B_4 82 


6849 


798 


2584 


4370 


6156 


784CIP2B_483 


6854 


799 


2585


4371 


6157 


784CIP2B_4B4 


6857 


800 


2586


4372


6158


784CIP2B_485


6861 


801 


2587


4373 


6159 


784CIP2B 486 


6873 


802 


2588


4374 


6160 


7B4CIP2B 487 


6875 


803 


2589


4375


6161


784CIP2B_488


6877




804


2590


4376


6162


784CIP2B_489


688C 


805


2591


4377


6163


784CIP2B_490 


68e5 


806 


2592 


4378


6164 


784CIP2B_491 


6890 


807 


2593 


4379


6165


784CIP2B_492


6890 


808


2594


4380


6166


784CIP2B_493


6894 


809


2595


4381


6167


784CIP2B_494


69 0; 


810


2596


4382


6168


784CIP2B_495 


6504 


811 


2597 


4383


6169


784CIP2B_496


6507 


812 


2598


4384


6170


784CIP2B_497 


6914 


813 


2599


4385


6171


784CIP2B_498 


6917 


814 


2600 


4386


6172 


784CIP2B_499 


6523 


815 


2601


4387


6173


784CIP2B_500


692 5 


816


2602 


4388 


6174 


784C1P2B 501 


6931 


817


2603


4389


6175


784CIP2B_502 


6935 


818


2604


4390


6176


784CIP2B_503


6940 


819


2605


4391


6177


784CIP2B 504 


L 694 ~ 


820


2606


4392


6178


784CIP2B_505 


1 6946 


821


2607


4393


6179


784CIP2B_506 


694'. 


822 


2608 


4394 


6180


784CIP2B_507 


6945 


823 


2609


4395


6181 


784CIP2B 508 


6955 


824 


2610 


4396 


6182 


784CIP2B_569 


6960 


825 


2611 


4397


6183


784CIP2B_510


6962 


826 


2612 


4398 


6184


784CIP2B_511


1 6963 


827 


2613 


4399 


6185


784CIP2B_512 


6967 


826 


2614 


4400 


61B6 


784C1P2B_513 


6 983 


825 


2615 


4402 


6137 


784CIP2B_514 


6988 


830


2616


4402


6188


784CIP2B_515 


699C 


831 


2617 


4403 


6195 


784C1P2B_516 


7 003 


632 


2616 


4404 


61 90 


764CIP2BS17 


7016 


833 


2619 


4 4 05 


619:, 


784CIP2B_518 


7017 


834 


2620 


4406 


6192 


784C1P2B_519 


7025 


83b 


2621 


4 4 07 


6193 


784CIP2B_520 


7025 


836 


2622 


4408 


6194 


784CIP2B_521 


7025 


837 


2623 


4405 


6155 


784CIP2B522 


7050 


83B 


2624 


4410 


6196 


784CIP2B_523 


7051 


839


2625


4411


6197


784CIP2B_524


7055 


840 


2626 


4412 


61 9fc 


784CIP2B 525 


7060 


841


2627


4413


6199


784CIP2B_526 


7064 


842 


2628 


4414 


620C 


784CIP2B 527 


7067 


643 


2629 


4415 


620: 


7B4C1P2B_528 


7071 



844 


2630 


4416 


6202 


784C1P2B529 


7072 


845


2631


4417


6203


784CIP2B_530


7073 


846 


2632 


4418 


6204 


784CIP2B 531 


7076 


84 7 


2633 


4415 


620S 


784CJP2B532 


7088 


848 


2634 


442C 


6206 


784CIP2B533 


7089 


845 


263b 


4421 


6207 


784C1P2B_534 


7091 


850 


2636 


4422 


6206 


784CIP2B_535 


7091 


851 


2637 


4423 


6205 


7B4C1P2B536 


7104 


852 


263e 


4424 


6210 


7B4C1P2B537 


7105 


853 


2639 


4425 


6211 


784CIP2B 536 


7105 


854 


2640 


4426 


621i 


7B4C3P2B_539 


7109 


855 


2641 


4427 


6213 


7 84C3P2B_54 0 


7109 


856 


2642 


4426 


6214 


7 84CIP2B_541 


7119 


657 


2643 


4425 


6215 


784CIP2B 542 


7120 


858 


2644 


443C 


6216 


784CIP2B_543 


7121 


655 


2645 


4431 


6217 


7B4CIP2B_544 


7126 


660 


2646 


4432 


6216 


784CIP2B_545 


7127 


861 


2647 


4433 


6215 


[ 784C1P2B_S46 


7130 


862 


2648 


4434 


6220 


784CIP2B__S47 


7131 


863 


2649 


4435 


6221


784CIP2B_548 


7144 


864 


2650 


4 436 


6222


784CIP2B 549 


7159 


865 


2651 


4437 


6223 


784CIP2B_550 


7163




866 


2652 


4438


6224


784CIP2B_551


717 b 


867


2653


4439


6225


784CIP2B_552


7388 


868


2654


4440


6226


784CIP2B_553


i 7389 


869


2655


4441


6227


784CIP2B_554


7190 


870 


2656 


4442 


6228


784CIP2B_555 


73 9 i 


871 


2657 


4443 


6229


784CIP2B_556


7203 


872


2658


4444


6230


784CIP2B_557 


7204 


873 


2659 


4445 


6231


784CIP2B_558


720C 


874 


2660


4446 


6232 


784CJP2B 559 


7209 


875 


2661


4447


6233


784CIP2B_560 


721G 


876


2662 


4448 


6234 


784C1P2B_561 


723 6 


877


2663 


4449 


6235 


784C1P2B_562 


7221 


878


2664 


4450 


6236 


784CIP2B 563 


7230 


879


2665


4451


6237 


784CIP2B_564 


723 7 


880 


2666 


4452 


6238


784C3P2B_565 


7240 


881 


2667 


4453 


6239


784CIP2B_566 


724 5 


882 


2668 


4454 


6240


784C1P2B_567 


725C 


883


2669 


4455 


6241 


784C1P2B 568 


7251 


884 


2670


4456


6242


784CIP2B569 


7255 


885 


2671 


4457 


6243


784CIF2B_570 


7260 


886 


2672 


4458 


6244 


784CIF2B_573 


7265 


887 


2673 


4459 


6245 


784CIF2B_572 


726e 


888


2674


4460


6246


784C1P2B_573 


7275 


889 


2675 


4461 


6247


784CIP2B 574 


7279 


890 


2676 


4462 


6248


784C1F2B_575 


7283 


891


2677


4463


6249


784CIF2B_576 


7283 


892 


2678 


4464 


6250


784CIP2B_577 


72 87 


893 


2679 


4465 


6253 


784CIP2B578 


7301 


894 


2680 


4466 


6251 


784CIF2B_579 


7308 


895


2681 


4467 


6253 


784CIF2B_580 


7306 


896 


2682 


4468 


6254 


784C1P2B S81 


7309 


897 


2683 


4469 


6255


784CIP2B582 


733 9 


896 


2684 


4470 


6256 


784CIP2B 583 


7320 


899 


2685


4471


6257


784CIP2B_584 


7326 


900 


2686 


4472 


6258 


784CIP2B_585 


7326 


903 


2687 


4473 


625S 


784CJP2B 586 


7334 


902 


2688 


4474 


626C 


784CIP2B_587 


7337 


902 


2689 


4 4 75 


6263 


784CIP2B_588 


7339 


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905 


2691 


4 4 77 


6263 


784CIP2B 590 


73 S 5 


906


2692


4478


6264


784CIP2B_591 


7363 


907 


2693 


4479 


6265


784CIP2B592 


7363 


908 


2694 


4480 


6266 


784C1P2B 593 


7365 


905 


2695 


4481 


6267 


784CIP2B_594 


7368 


910 


2696 


4482


6268


784CIP2B 595 


736 9 


913 


2697 


4483 


6265 


784CIP2B 596 


7372 


912 


2698 


4484 


6270 


784CIP2B_599 


7 37 b 


913 


2699 


4485 


6271 


784CIP2B_600 


7381 


914 


2700 


44 86 


6272 


784CIP2B_601 


7383 


915 


2701 


4487 


6273 


784CIP2B_602 


7387 


916 


2702 


4488 


6274 


784CIP2B_603 


7391 


917 


2703 


4489 


6275 


784CI P2B_604 


7393 




2704 


44 90 


6276 




_ 7395 


919 


2705 


4491 


6277 


784C1P2B_606 


7397 


920 


2706 


4492 


6278 


784CIP2B_607 


7395 


921 


2707 


4493 


6273 


784CIP2B608 


7405 


922 


2708 


4494 


6280 


784CIP2B 609 


7406 


923 | 


2709 


4495 


6283 


784CIP2B 610 


7406 


924 ) 


2710 1 


4496 


6282 


784CIP2B_611 


7409 


925 j 


2711 


4497 I 


6283 


784CIP2B_612 


7410 


926 


2712 


4498 * 


6284 


784CIP2B_613 


7411 


927 


2713 


4499 | 


6285 


784CIP2B 614 


7417




928


2714


4500


6286 


784CIP2B_615 


7418 


929


2715


4501


6287


784CIP2B_616


7421 


930 


2716 


4502


6288


784CIP2B_617


7422 


931


2717


4503


6289


784CIP2B_618


7422 


932 


2718 


4 504 


6290 


784C1P2B_619 


7423 


933


2719


4505


6291


784CIP2B_620


7424 


934


2720


4506


6292 


784CIP23 621 


7426 


935 


2721 


4507


6293 


784CJP23 622 


7427 


936 


2722 


4508 


6294 


784C1P23, 623 


742B 


937


2723


4509


6295 


784CIP29 624 


74 3 0 


938


2724


4510


6296 


784C1P2B_625 


7435 


939


2725 


4511 


6297 


784CIP2B 62 6 


7437 


940


2726


4512


6298 


784C1P23_627 


7439 


941


2727


4513


6299 


784CIP2B_626 


7440 


942 


2726 


4514 


6300 


784C]P23_G25 


7442 


943 


2729 


4511 


6301 


784C1P2B 63 0 


7450 


944 


2730 


451* 


6302 


784C1P23_631 


7451 


94 5 


2731 


4 517 


6303 


784C1P2B_ 632 


7452 


946


2732 


4518 


6304 


784CIP23_633 


7454 


94 7 


2733 


4519


6305 


784C1P2B 634 


7457 


948 


2734 


4520


6306 


784CIP2B 635 


7459 


949 


2735 


4523 


6307 


784CIP2B_636 


7461 


950


2736


4522


6308


784C1P23 b37 


7463 


951


2737


4523


6309 


784CIP2B 63 8 


7466 


952 


2738 


4524 


6310 


784CIP2B_639 


7469 


953 


2739 


4525 


6311 


784C1P23 640 


7473 


954


2740


4526


6312 


7B4CIP2B 641 


7481 


955 


2741 


4527 


6313 


784C1P2B 642 


7482 


956 


2742 


4528 


6314 


784C1P2364 3 


7482 


957 


2743 


4525 


6315 


784C1P2B644 


7483 


956 


2744 


4530 


6316 


784C1P2B_64 5 


7485 


959 


2745 


4531 


6317 


784CIP23 64 6 


7486 


960 


2746 


4S32 


[ 6316 


784CIP2B_647 


7487 


961 


2747 


4 53 3 


6319 


784CIP2B648 


7491 


962 


2748 


4534 


6320 


784CIP2B 645 


7492 


963 


2749 


4535 


6321 


7B4CIP2B650 


|" 7494 


964 


2730 


453£ 


6322 


784C1P23 651 


7498 


965 


2751 


4537 


6323 


784C1P2BG52 


7504 


966 


2752 


4538 


6324 


784C1P23653 


7508 


967 


2753 


4539 


6325 


784C1P2B654 


7516 


968 


2754 


4540 


6326 


784C1P2B 685 


7518 


96 5 


2755 


4541 


6327 


784C1P2B_650 


7519 


970 


2756 


4542 


6328 


784CIP2B_657 


7521 


973 


2757 


454 3 


6329 


784C1P2B656 


7529 


972 


2758 


4544 


6330 


784C1P2B 689 


7532 


973 


2759 


4548 


6331 


784CIP2B 660 


7533 


974 


2760 


454f 


6332 


784C1P2B_661 


7535 


975 


2761 


4 54 7 


6333 


784CIP2S662 


7545 


976


2762


4548


6334 


784C1P23_663 


i 7546 


977


2763


4549


6335 


784C1P23664 


7552 


978 


2764 


455C 


6336 


784C2P2B_665 


7554 


975 


2765 


4551 


6337 


784C1P2B666 


7567 


960 


2766 


4552 


633B 


7B4C1P23_667 


7569 


981 


2767 


4 553 


6339 


784C1P2B666 


7575 


982 


2768 


4 554 


6340 


784CIP23_665 


7575 


583 


2769 


4555 


6341 


784C1P23_670 


7S77 


984 


2770 


4556 


6342 


784C1P2B671 


7579 


985 


2771 


4557 


6343 


784C1P23_672 


7582 


986 


2772 


4558 


6344 


784C1P2B_673 


7587 


987 


2773 


4555 


6345 


784C1P23_674 


7589 


986 


2774 


4 560 


6346 


784C2P2B_675 


7597 


989


2775


4561 


6347 


784CIP2B_6 76 


7597




990 


2776 


4562 


634 6 


7 84C3P2 3_677 


7609 


991


2777


4563


6349


784C1P2B_676 


7609 


992


2778 


4564 


6350 


784CIP2B_679 


7609 


993 


2779 


4565 


6351


784CIP2B 680 


7613 


994 


2780 


4 560 


6352 


784C1 P23_66j 


7623 


995


2781


4567 


6353 


784CIP23 682 


7629 


996


2782


4568


6354 


784CIP2B683 


7630 


997


2783


4569


6355


784CIP2B_684


7633 


998 


2784 


4570 


6356 


784CI P2B685 


7635 


995 


2785 


4 571 


6357 


784ClP2B_68e 


7638 


1000 


2786 


4572 


6358 


784CIP2B687 


7639 


1001


2787 


4573 


6359 


784C1P2B 688 


7646 


1002 


2788 


4574 


6360 


784CIP2B685 


7647 


1003 


2709 


4575 


6362 


784CIP2B_690 


7648 


1004 


2790 


4576 


6362 


784CIP2B691 


7658 


1005 


2791 


4577 


6363 


784C1P2B_692 


7664 


1006


2792


4578


6364 


784CIP2B 693 


7664 


1007 


2793 


4579 


6365 


784C1P2B695 


7674 • 


1006 


2794 


4580 


6366 


784CIP2B696 


76 7 5 


10 05 


2795 


458: 


6367 


784CIP2B 697 


7676 


1010 


2796


4582 


6368 


784CIP2B69B 


768- 


1011


2797


4583


6369 


784C3P2B_699 


7688 


1012 


2798 


4584 


6370 


784CIP2B^ 700 


76 9 3 


1013 


2799 


4585 


6371 


784CIP2B_701 


7694 


1014 


2800 


4586 


6372 


7 84CIP2B_702 


77 3 5 


1015 


2801 


4587 


6373 


784CIP2B..703 


7716 


1016 


2802 


4588


6374 


784CIP2B_704 


7718 


1017


2803 


4589 


6375 


784C1F2B705 


7721 


1018 


2804 


4590 


6376 


784C1P2B_70€ 


7723 


1019 


2805 


4593 


6377 


784CIP2B_ 707 


7729 


1020 


2806 


4592 


6378 


784CIP2B_708 


7733 


1021 


2807 


4593 


6379 


784CIP2B_ 709 


7735 


1022 


2808 


4594 


1 6380 


784C1P2B_710 


7741 


1023 


2809 


4595 


6381 


7B4CIP2B_7ll 


7743 


1024 


2810 


4596 


6382 


784CIP2B_712 


77*6 


1025 


2811 


4597 


6383 


784CIP2B_713 


7749 


1026 


2812 


4598 


63B4 


784CIP2B_714 


775C 


102 7 


2813 


4599 


6385 


784CIP2B715 


77S7 


1028 


2814 


4600 


63 06 


784CIP2B_716 


7759 


102 5 


2815 


4603 


6387 


784CIP2B_717 


7760 


1030 


2816 


4602 


6388 


7B4CIP2B_718 


7760 


1031 


2817 


4603 


6389 


78 4CIP2B_719 


7764 


1032 


2818 


4604 


6390 


764CIP2B720 


7765 


1033 


2819 


4605 


6391 


784CIP2B_721 


7766 


1034 


2820 


4606 


1 6392 


7H4CIP2B_722 


7767 


1035 


2821 


4607 


6393 


784C1P2E_723 


7769 


1036 


2822 


4608 


6394 


784CIP2B_724 


7770 


1037 


2823 


4609 


6395 


784C1P2B_725 


7774 


1036 


2824 


4610 


6396 


784CIP2B_726 


7779 


1035 


2 825 


4611 


6397 


784CIP2B 727 


7781 


1040 


2826 


4612 


6398 


784CIP2B_728 


7782 


1041 


2827 


4613 


6399 


j 784CIP2B_729 


7783 


1042 


2828 


4634 


6400 


784CIP2B_730 


7787 


1043 


2829 


4615 


6401 


784CIP2B_731 


7792 


1044 


2830 


4616 


6402 


784CIP2B_732 


7795 


1045 


2631 


4617 


6403 


784CIP2B 733 


7801 


1046 


2832 


4616 


6404 


784CIP23_734 


7807 


1047 


2833


4619


6405


784CIP29_735 


7008 


1048 


2834 


4 62 0 


6406 


784CIP23_73 6 


7819 


1049 


2835 


4621 


6407 


784C1P2B_737 


7824 


1050


2836 


4622 


6408 


784CIP2B_73 8 


7826 


1051 


2837 


4623 


6409 


784C1P2B_739 


7829






'QIC 


4624


6410


784CIP2B_740


7832 


1 At-. 




^ o* - 


64 1 j 


7fi 4 r*T P?K 7 A 1 


7835 


1054


2840


4626


6412


7H40TP2P 7 4^ 


7647 


1055


2841


4627


6413


784CIP2B_744


7 B46 


JUjC 




4 6 2 F 


6414 


7P4CTP7R 74R 


7 fi 53 1 


±\,iil 


ZC H J 


4 625 


64 1 r> 


7fiirTPPR 74fi 


7 8 54 




2844 




64 1 C 




7 856 


1059


2845




64 1 7 


7P4PTP7R 74P 


7862 


1060


2846 




6418 


7RAC1 D7P 74 Q 


7865 


1 06 4. 


2847 


4 633 


6415 






1062 


284 8 


4 634 




7R4P7P7R It,-, 


i 7877 


1063 


284 9 


4 635 


b h x. i 


/0*)V_lF<dl3 ZD/ 


7 ft ftn 
/ 0 ou 


1064 


2850 


4636 


6422 




7882 


i iots 


2851 


4 63 7 


6423 


TD/I^T Don OC J 


78 84 


1066 


2852 


4638 


64 24 


78 4 CI P2B_755 


7886 


1067 


2853


4639


6425


784CIP2B_756


7888 


1068


2854


4640


6426


784CIP2B_757


i b q c 


1069 


2855 


4641 


6427 


784C1 P2B_758 


i 7901 


1070 


285€ 


4642 


6428 


784CIP2B 759 


7910 


1071 


285"/ 


4643 


642 9 


784CIP2B760 


7911 


1072 


2858 


4644 


643 0 


784C1P2S_ 761 


7921 


1073


2859


4645


6431


784CIP2B_762


7923 


1074 


286C 


4646 


6432 


704C1P2B_763 


7924 


1075 


2861 


4 64 7 


6433 


784C1F2B_764 


7925 


1076 


2862 


4648 


6434 


784C1 P2B_765 


7 928 


1077 


2863 


4649 


6435 


7B4C1P2B _766 


7929 


1078 


2864 


4650 


6436 


784CIP2B_767 


7930 


1075 


2865 


4651 


6437 


784C1P2B_768 


7934 


1080 


2566 


4652 


6438 


784C1P2B_769 


7938 


1081 


2867 


4652 


643S» 


784C1P2B770 


7942 , 


1082 


2866 


4654 


6440 


784C1P2B771 


7945 , 


1083 


2B69 


4655 


6441 


784CIP2B_772 


7946 ] 


1 10B4 


2870 


4656 


6442 


784CIP2B^_773 


7548 | 


108b 


2871 


4 657 


644? 


784C1P2B_774 


7951 


10B6 


2872 


4658 


6444 


784C1P2B_775 


7952 


1087 


2873 


4659 


644S 


784C1P2B776 


7953 


1086 


2874 


4660 


644 6 


784CIP2B_777 


7954 


108? 


2875 


4 661 


644 7 


784CIF2B_778 




1090 


2876 


4662 


644 6 


784CIP2B 779 


7958 


10 91 


2877 


4 663 


644 9 


/o^tlr^b /oil 


/ J»o j. • 


1092 


— " — n ana 




6450 




7965 ' 


1093 


2 879 


4 665 


6451 




7G£ c 


1 094 


2880 


4 666 




7ftAr , io7n nni 


'if I if 




7ftfii 


A C C7 

h b b / 




7 AC rTDTR 7fli 
/ O ** l_ 1 trX Cr /On 


7986 




2 882 


4 boo 






7986 


1097 


n o o 1 
/. P Dw 


c a o 
4f>p5* 




7fl4rTP5R 7ft<» 
/o*t\^ir^.D /op 


7988 




2 8 84 




6456 


7H4C1P9B 7B7 


7991 




2 BBS 


4.671 


- T4 57 


7B4CIP2B 7ftB 


7992 


11 00 


2 8 66 


4 £ 77 


64 58 


7ft4r"TP9R 7fl9 

/ o " L 1 r i- D f 07 


7992 j 


1 1 01 


2887 


£ £ 7"* 


6459 


7R4PTP5R 790 


7992 


1102 


2 888 


A CI A 

4 b /4 


£4 fcn 


7fl4P7P7P. 7Ql 


7992 


Tin* 


2 889 


4 675 


CJI1 

b sci 




9003 


1104 


2890 


4676 


6462 


784CIP2B 793 


8014 


1105 


2091 


4677 


6463 


784CIP2B_794 


801S 


1106 


2892 


4678 


6464 


784CIP2B_795 


8016 


1107 


2893 


4679 


"64o5 


784CIP2B_796 


8017 


1108 


2894 


4680 


6466 


784CIP2B_797 


8019 


110? 


2895 


4661 


6467 


784CIP2B_798 


8020 


1110 


2696 


4682 


6469 


784CIP2B_799 


8022 


1111 


2697 


4683 


6469 


784CI?2B_800 


8022 


1 1112 


2698 


4684 


6470 


784CIP2B_801 


8028 


! 1113 " 


2899 


4685 


6471 


784CiP2B_802 


8030


1114 


2900


4686


6472


784CIP2B_803


8038


1115


2901


4687


6473


784CIP2B_804


8042


1116


2902


4688


6474


784CIP2B_805


8045


1117 


2903


4689


6475


784CIP2B_806


8045 


1118


2904 


4690 


6476 


784CIP2E_807 


8046 J 


1119


2905


4691


6477 


784C1P2B^808 


8047 


1120


2906


4692


6478


784CIP2B_809


8051 


1121 


2900 


4693 


6479 


784CIP2B_810 


8059 


1122 


2908 


4694 


6480 


i 784CIP2B_8ll 


8064 


1123 


2909 


4695 


6481 


784C1P2B812 


8069 


1124 


2910 


4696 


6482 


784CIP23813 


8074 


112S 


j 2911 


4657 


6483 


784C1P2B814 


8077 


1126 


2912 


4698 


64 84 


784C1P22815 


8078 


1127 


2913 


4699 


6485 


784CIP2B_816 


8079 


1126 


2914 


4 700 


6466 


784C3P2B_837 


8084 


1125 


2915 


4701 


6467 


784CIP2B 818 


8088 


1130 


2916 


4702 


6488 


7&4C1P2B_819 


l 8090 ' 


1131 


291*> 


4 703 


6489 


784C1P2B 820 


8091 


1132 


29ie 


4 704 


6450 


764C1P2B 821 


8099 


1133 


2919


4705


6491


784CIP2B_822


8099 


1134 


2920 


4706 


6492


784CIP2B_823


8100 


1135 


2921


4707


6493


784CIP2B_824


8102


1136


2922


4708


6494


784CIP2B_825


8103 


1137 


2923


4709


6495


784CIP2B_826


8103


1138 


2924 


4710


6496


784CIP2B_827


8104


1139


2925


4711


6497


784CIP2B_828


8108 


1140 


2926


4712


6498


784CIP2B_829


8110


1141 


2927 


4713


6499


784CIP2B_830


8116 


1142


2928


4714


6500


784CIP2B_831


8117 


1143 


2929


4715


6501


784CIP2B_832


8123 


1144 


2930 


4716 


6502 


784CIP2B_833


8130 


1145


2931


4717


6503


784CIP2B_834


8330 


1146 


2932 


4718


6504


784CIP2B_835


8143 


1147 


2933 


4719


6505


784CIP2B_836


82 4 3 


1148 


2934 


4720


6506


784CIP2B_837


8154


1149 


2935 


4721


6507


784CIP2B_838


8155 


1150


2936


4722


6508


784CIP2B_839


8362 


1151 


2937 


4723


6509


784CIP2B_840


816-3 


1152 


2938 


4724


6510


784CIP2B_841


8172 


1153 


2939


4725


6511


784CIP2B_842


8173


1154 


2940


4726 


6512 


784CIP2B_843 


8175 


1155 


2941 


4727


6513


784CIP2B_844 


8162 


1156 


2942 


4728 


6514 


784CIP2B_845


8183 


1157 


2943 


4729 


6515 


784CIP2B_846


8184 


1158


2944


4730


6516


784CIP2B_847


8185 


1159


2945


4731


6517


784CIP2B_848


8187 


1160


2946


4732


6518


784CIP2B_849


8188 


1161


2947


4733


6519


784CIP2B_850


8190 


1162 


2948


4734


6520


784CIP2B_851


8150 


1163 


2949


4735


6521


784CIP2B_852


8192 


1164 


2950 


4736 


6522 


784CIP2B_853


8193 


1165 


2951 


4737 


6523 


784CIP2B_854


8197 


1166 


2952 


4738 


6524 


784CIP2B_855


8197 


1167


2953


4739


6525


784CIP2B_856


8195 


1168 


2954 


4740


6526


784CIP2B_857


8202 


1169 


2955 


4741 


6527 


784CIP2B_858


8203 


1170 


2956 


4742 


6528 


784CIP2B_859 


8208 


1171 


2957


4743


6529


784CIP2B_860


8209 


1172 


2958


4744


6530


784CIP2B_861


8211 


1173 


2959 


4745


6531


784CIP2B_862 


8214 


1174 


2960 


4746 


6532 


784CIP2B_863


8217 


1175 


2961 


4747 


6533 


784CIP2B_864


8223 




1176 


.296; 


4748 


653 l 


■?84C1P2B_865 


8224 


1177 


296} 


4749 


6535 


784CIP2B_866 


622t> 


1178 


296< 


4750 


653< 


j 784C1P2B_867 


8227 


1179 


296- 


4751 


653'. 


784C1P2B_868 


e229 


1180 


296f 


4752 


653^ 


7 84CIP2B__869 


8232 


1181 


296: 


4753 


t 653 " C 


784CIP2B_870 


8236 


1182 


296* 


4754 


654C 


784C1P2B_871 


8239 


1103 


296S 


4 755 


6541 


784C1P2B_872 


8244 


1184 


29 7C 


4756 


6542 


784CIP2B_873 


8245 


118S 


297: 


4757 


654? 


784C1P2B 874 


824 8 


1186 


297: 


4758 


654 4 


784CIP2B_875 


825: 


1187 


297} 


4759 


654 L 


784CIF2B_876 


8253 1 


1188 


2974 


4760 


6546 


784CIP2B_877 


8260 


1189 


2975 


4761 


6 54 7 


784CIP2B_878 


8262 


1190 


297t 


4762 


6 54f 


784CIP2B_879 


8268 | 


1191 


297', 


4 7 63 


6 54f- 


784CIF2B_8BO 


8270 


1192 


297^ 


4764 


655C 


784CIP2B_881 


8272 


1193 


297^ 


4765 


6552 


784CIP2B_882 


8274 


1194 


298( 


4766 


6S5^ 


784CIF2B_883 


8274 


119S 


2981 


4767 


G55C- 


784CIF2B_ee4 


8275 


1196 


f 2982 


4768 


6554 


784CIP2B_885 


8277 


1197 


298> 


4769 


655b 


784C1P2B_8B6 


8281 


1198 


2984 


4770 


655t 


784CIP2B_887 


8283 


1199 


2981 


4771 


6 55-- 


784CIP2B_888 


8285 


1200 


298t 


4772 


6SSfc 


784CIP2B_889 


8295 


1201 


2987 


4773 


655S- 


784CIP2B_B90 


83DO 


1202 


298f 


4774 


6 56 0 


784C1P2B_891 


8303 


1203 


2985 


4775 


6561 


784C1P2B_892 


8304 


1204 


299C 


4776 


6 56: 


784CIP2B893 


8305 


1205 


2991 


4777 


6 36j 


784CIP2B_894 


8309 1 


1206


2992


4778


6564


784C1P2B_895 


83ie 


1207 


259;* 


4779 


6 561 


784CIP2B_896 


8319 


1208 


2994 


4780 


656( 


784CIP2B_B97 


8321 j 


1209 


2591 


4781 


6 567 


784CIP2B_898 


8322 J 


1210 


299< 


4782 


656fc 


784CIP2B_899 


8323 


1211 


2997 


4783 


6 56 9 


784CIP2B_900 


8325 j 


1212 


2 996 


4784 


657C 


784CIP2B 901 


8331 I 


1213 


2995- 


4785 


6573 


784CIP2B_902 


8332 } 


1214 


300C 


4786 


6572 


784CIP2B_903 


8333 j 


1215 


300 


4787 


6573 


784CIP28_904 


8335 | 


1216 


300? 


4788 


6574 


784C1P2B_905 


8336 | 


1217 


300? 


4789 


657!: 


784CIP2B_906 


8337 


121B 


3004 


4790 


657t 


7B4CIP2B_907 


8340 


1219 


300b 


4791 


657"? 


784C1P2B_90& 


8343 


1220 


300f 


4792 


657* 


784CIP2B 909 


834 7 


1221 


3 007 


4793 


6S7S- 


784CIP2B_910 


8345 


1222 


300L 


4794 


658C 


784CIP2B„911 


8351 


1223 


300V 


4795 


658-j 


784CIP2B_912 


S3 53 


1224 


301C 


4796 


6582 


784CIP2B 913 


8355 


1225 


301i 


4797 


6583 


764CIP2B_914 


8361 


1226 


3012 


4798 


6584 


784CIP28_915 


8365 


1227 


301? 


4799 


658b 


784CIP2B_916 


8367 


1228 


301 *» 


480O 


6586 


784CIP2B_917 


8369 


1229 


301i 


4801 


6587 


784C1P2B_919 


8375 


1230 


3016 


4802 


6588 


784CIP2B_920 


8387 


1231 


3017 


4803 


658S 


784CIP2B_921 


8391 


1232 


301* 


4804 


6590 


784CIP2B_922 


8393 


1233 


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924 


83 94 


1235 


3023 


4807 


6 591- 


784CIP2B_925 


8395 


1236 


302; 


4808 


6594 


784CIP2B_926 


8396 


1237 


3023 


4809 


6595 


784CIP2B 927 


8398 






1238


3024


4810


6596


784CIP2B_928


840^ 


1239


3025


4811


6597


784CIP2B_929


8402


1240


3026


4812


6598


784CIP2B_930


8405 


1241 


3 0 27 


4813 


6595 


784CIP2B 931 


8406 


1242 


302t 


4814 


660C 


784CIP2B_932 


6405 


124} 


3029 


4815 


660. 


784CIP2B_933 


8410 


1244 


3030 


4816 


66 02 


784CIP2B_934 


84 14 


124S 


3031 


4 817 


6603 


784CJP2B935 


84 3^ 


1246 


3032 


4810 


6604 


784C3P2B_936 


8415 


1247 


3033 


4815 


6605 


784CIP2B_937 


842* 


1246 


3034 


482C 


6606 


784CIP2B_938 


843C 


1/49 


3035 


4823 


66 0? 


784CIP2B 939 


843. 


125C 


3036 


4 822 


6606 


784CIP2B_940 


6432 


1253 


303 7 


4823 


66 05 


7 84CIP2B_941 


8433 


1252 


3036 


4824 


6610 


784CIP2B_942 


8434 


1253 


3035 


4 825 


66 31 


784CIP2B_943 


843fc 


1254 


3040 


4826 


6612 


784CIP2B 944 


8435 


1255 


3 04 1 


4 827 


6613 


784CIP2B_945 


B44i 


1256 


3042 


4828 


66 14 


784CIP2B_946 


845C 


1257 


3043 


4829 


6635 


784CIP2B_947 


8453 




1258


3044


4830 


6616 


784C1P2B_948 


6452 


1255 


3041 


4 833 


6617 


784CIP2B 949 


8460 


126C 


3046 


4 832 


66 lb 


784C1P2& 950 


8463 


1261 


3047 


4 833 


6615 


784CIP2B 951 


84 61 


1262 


3046 


4634 


66 2 C 


784CIP2B 9S2 


6464 


1263 


3045 


4 835 


662j 


784CIP2B 953 


8465 


1264 


3050 


4836 


6622 


784C1P2B 954 


8467 


1265 


3051 


4 H37 


6623 


784C1P2B 955 


847C 


1 " 1266 


3052 


4838 


662 4 


784C1P2B 956 


B473 


1267 


3053 


4839 


6625 


784CIP2B 957 


94 73 


1268 


3054 


4840 


6626 


784CIP2B_956 


8474 


1269 


3055 


4841 


6627 


784CIP2B 959 


847b 


1270 


3056 


4842 


6628 


784C1P2B 960 


847* 


1271 


3057 


4843 


6625 


784CIP2B_961 


84 8C 


1272 


3058 


4844 


6530 


784CIP2B_962 


8482 


1273 


3059 


4 645 


6531 


784C1P2B 963 


348> 


1274


3060


4846 


6632 


784C1P2B 964 


848t 


1275 


" 3061 


4 847 


6633 


784CIP2B 965 


84 06 


1276 


3062 


4848 


6634 


784CIP2B_966 


8492 


1277 


3063 


4849 


6635 


784CIP2B_967 


84 94 


1276 


3064 


4850 


6636 


784C1P2B 968 


8496 


3275 


3065 


4851 


6637 


784CIP2B_96 9 


84 97 


32BO . 


3066 


4852 


6636 


784CIP2B_970 


8499 


1281 


3067 


4853 


6635 


784CIP2B_573 


8512 


1282 


3068 


4854 


6640 


784CIP28_972 


8522 


1283 


3069 


4855 


6641 


784CIP2B_573 


8526 


3284 


3070 


4856 


6642 


784CIP2B_974 


853: 


3285 


3071 


4857 


6643 


784CIP2B_975 


8533 


1286 


3072 


4858 


6644 


784CIP2fi_976 


8542 


1287 


3073 


4859 


6645 


784CIP2B 977 


8544 


1288 


3074 


4860 


6646 


784CIP2B_978 


8565 


1289 


3075 


4861 


6647 


784CIP2B_979 


8565 


3290 


3076 


4862 


6648 


784CIP2B 980 


"8572 


129: 


3077 


4863 


664S 


784CIP2B_981 


8576 


1292 


3076 


4 864 


6650 


784CIP2B_982 


8576 


1293 


3079 


4865 


6651 


784CIP2B_963 


8584 


1294 


3080 


4866 


6652 


784CIP2B 984 


" TS96 


1295 


3083 


4867 


6653 


784CIP2B_98S 


8602 


1296 


3082 


4868 


6654 


7B4CI?2B_986 


8604 


1297 


3083 


4869 


6655 


784CIP2B_987 


8605 


1298 


3084 


4870 


6 656 


784C1P2B_988 


8612 


1295 


3085 


4871 


6657 


784C1P2B 989 


8637 


130C | 


3086 


4 672 ] 


6G58 


7Q4C1P2\ 990 


8640 | 


130] 


3 067 


4 873 


66S9 


784C3P2B 993 


8643 | 


1302 


3086 


487<? | 


6660 


784CIP2B_ 992 


864S ( 


13 01- 


3069 


4 8?!: 


6661 


784CI?2B_ 993 


8650 I 


t 13 04 


30SO 


487f 


66C2 


784CIP2B_994 


8651 | 


130S 


3091 


4 67 7 


6663 


784C1P2B, 995 


8654 


isoe 


30S2 


4876 


6664 


784C1P2E_ S96 


86S5 


^ 1307 


3093 


4 67? 


6665 


784C1P2B 997 


8657 


130b 


3094 


4860 


6666 


784CIP2B 998 


8665 


1305 


3055 


486i 


6667 


784CIP2B_999 


8668 


131C 


30 96 


4 8 62 


6668 


784CIP2B 1000 


8671 


1313 


3097 


4 8 63 


6669 


784C2P2B_3O01 


8672 


I3TS 


3096 


4 86-: 


6670 


784CIP2B_3 002 


8692 


1313 


3099 


48c f 


6671 


784C1P23 1C03 


87C6 


1314 


310C 


4 866 


6672 


784C3P23 1004 


8716 


13 IS 


3101 


4 867 


6673 


" 784C1P2B 1C05 


8719 


1316 


310; 


4B81- 


6674 


784CIP2B_1006 


8743 


131'/ 


31 03 


4 889 


6675 


784C1P2B 1007 


6764 


1 31 1 


31 04 


4 8 90 


667 6 


784C1P2B 100B 


8764 


1 3 1 9 


3 3 0b 


4 89 j 


6577 


784C3P2B 1009 


8764 


13 20 


3 1 06 


4 8 5: 


6678 


784C3P2B 1010 


8774 


1321 


33 07 


4 893 


6679 


784C3P2B 1C33 


8782 


13 2 2 


33 06 


4 894 


6680 


784CIP2B 10l2 


8796 




1323


3109


4895


6681


784CIP2B_1013


8827 


3 3 24 


311C 


4 8 9( 


6 6 82 


784C1P2B 1014 


8042 


1 32 e 


3111 


4 89*/ 


6603 


7B4CIP2B 1C15 


8842 


13 2 0 


'3112 


4 6 91 


6684 


784C1P2B 1016 


88S8 


1327


3113


4899


6685


784CIP2B_1017


8871


1 3 2 


3114 


4 90C 


6686 


784C1P2B 101E 


8921 




1329


3115


4901


6687


784CIP2B_1019


8927 


1 ~ 3 0 


3116 


3 901- 


6688 


784C3P2B 1020 


8942 


1 ~* 3 1 


3117 


4 9 0/ 


66 89 


784C3P2B 1021 


8994 


3 3 3 2 


3116 


4 904 


6690 


784C3P2B 1022 


9023 


3 333 


3119 


45*0". 


6691 


784C1P2B 1C23 


9028 


3 334 


3120 


490* 


6692 


784CIP2B 1C24 


9058 


3335 


3121 


4 9 0 7 


6693 


784CIP2B_ 1C25 


9058 


3336 


3122 


4906 


6694 


784CIP2B_3 0> , 6 


9079 


i 3 3 7 


3123 


490^ 


6695 


784CIP2B 1027 


9079 


1336 


3124 


491C 


6696 


784CIP2B_1028 


9082 


1339 


3125 


4911 


6697 


784CIP2B_1029 


9084 


1340 


3126 


491/ . 


669B 


784CIP2B_1030 


9093 


1341 


312 7 


491 > 


6699 


784CIP2B 1031 


9101 


1342 


3126 


4914 


6700 


784C3P2B_1032 


9103 


1343 


312£- 


4 93- 


6701 


784C1P2B_1C33 


9105 


3344 


333C 


4S3( 


6702 


784CIP2B1034 


9151 


1345:- 


3331 


4 93' 


6703 


784CIP2B_3035 


9161 


3346 


333^ 


4911 


6704 


784CIP2B_ 1036 


9172 


134? 


3133 


493 f- 


6705 


784C1P2B_1037 


9174 


1348 


3134 


4 92 0 


6706 


784CIP2B 1038 


9204 


1345* 


3135 


4923 


6707 


784CIP2B_1039 


9234 


1350 


3136 


492; 


6708 


784CIP2B_1040 


9235 


1351 


3137 


4 923 


6709 


784CIP2B_1041 


9239 


1352 


3138 


4924 


6710 


7B4CIP2B 1042 


L " 9256 


13£3 


313S 


4925 


6711 


784CIP2B_1043 


9276 


3 354 


3140 


492* 


6712 


784CIP2B__1044 


S345 


3355 


3141 


4 92? 


6713 


7 84CIP2B_1045 


9379 


1356 


3142 


4926 


6714 


784CIP2B_1046 


9435 


1357 


3143 


4925- 


6715 


784CIP2B1047 


9437 


1356 


3144 


493C 


6716 


784CIP2B 1048 


9469 


1355 


3145 


493: 


6717 


?64CIP2B_1049 


9500 


136C 


3146 


4932 


6718 


784CIP2B_1050 


9502 


1361 


3147 


4 933 


6719 


784CIP2B_1051 


9520 


136^ 


314 £ 


4 934 | 672C 


784CIP2B_ 1U52 


9541 


1 3 6 y 


3 14 c j 


4.5 3 5 1 6721 


784CIF2B_1053 


| 9541 


1364 


3 15C 


1936 j 6722 


784C1P2B_1054 


954b 


1 36 i 


3151 


4 S3 7 


6723 


78<1CIP2B_105S 


9 5 56 


1 3 6 C 


3 152 


4 538 


6724 


784CIP2B_ 1056 


; 9556 


136"* 


3153 


4 929 


6725 


784CIPZB 1057 1 9575 


2 36f* 


3 1 54 


4 94 0 


6726 


784CIP2B_ 1058 ; 9589 


1365 


3155 


4941 


6727 


784C1P2B_ 1059 1 9599 


137C 


3156 


41-42 


6728 


784CIP2B_ 1060 , 9602 


1373 


3157 


*> 943 


6729 


784C1P2B_ 1061 9606 


I 372 


3158 


4 94 4 


6730 


784CIP2B_ 1062 9622 


1373 


3159 


',545 


6731 


784CIP2B2063 . 9623 


1374 


3160 


4 946 


6732 


784CIP2B_ 3064 9646 


13 75 


3163 


4 94 7 


6733 | 784CIP2B1065 ; 9747 


137£ 


3152 


4 94 e 


6734 


784CIP2B_1066 : 9773 


1377 


3363 


4 94 9 i 673 5 


784CIP2E_ 2067 | 9785 


137£ 


3164 


4 95C 


6736 


784C1P2B_1068 9801 


1379 


316b 


1951 


6737 ; 784CIP2B_1069 : 9811 


i 3 B C 


316G 


4 952 


6738 


784C1F2B_1070 , 9843 


13e3 


3167 


'4953 


6739 


784CIP2E_1071 


5854 


1382 


3168 


4 954 


6740 


764CIP2B_ 1072 


9854 


1383 


3169 


4955 


6741 


784C1F2B_1073 


9864 


2 3 04 


3170 


4:>56 


6742 


784CIP2B_1074 


9864 


3 385: 


3171 


4 9 57 


6743 


784CIP2B_ 3075 


9871 


1386 


3172 


« 958 


6744 


784CIP2B_1076 


9879 


2 38 t 


3173 


4959 


6745 


784CIP2B_ 1077 


9881 


138b 


3274 


4 96 0 


6746 


784C1P2B_ 1078 


9885 


1385 


3171 


4 961 


6747 


/84C1 YZd^ 10/9 


99 01 


1390 


317S 


4 962 


674 8 


784CIP28_ 1080 


9912 


1391 


3177 


*963 


674 9 


784CIP2E_ 3081 


9916 


1392 


3178 


4 964 


6750 


784CIP2B_ 1082 


9921 


1393 


3179 


4 96 5 


6751 


784CJP2B_ 1083 


9925 


13 54 


3180 


4 966 


6752 


784ClP2B_ 3084 


9930 


339b 


3181 


4967 


6 75 3 


784CIP2B_1085 


994 9 


139b 


3182 


4966 


6754 


784CIP2B_ 1086 


9951 


1397 


3183 


4969 


6755 


704C:P2B_ 1087 


9559 


1398 


3164 


4570 


6756 


764ClP2B_108e 


9973 


13 9 9 


3185 I 4971 


5757 


7B4C1P2B 3089 


9982 


1400 


31B6 


4 5 72 


675f 


784CIP2B_ 1090 


9994 


14 01 


3187 


49 73 


6759 


784CIP23_1091 


10021 


1402 


3188 


4974 ' 


6760 


784C1P23_1C92 


10041 


14 03 


3189 | 4S75 


6761 


784C3P2B_ 1094 


10067 


140^ 


3190 | 497b 


6762 


784C1P2B_1095 


10073 


14 OS 


3191 4977 


6763 


784CIP23_ 1096 


10112 


14 0t 


3192 1 4976 


6764 


784C1P2B_1097 


10117 


14 07 


3193 ; 4979 


6765 


784C3P2B_1098 


30332 


14 08 


3194 


498C 


6766 


7B4CIP2B1099 


10169 


14 09 


3195 


4 961 


6767 


784CIP2B 1100 


10217 


1410 


3196 


4962 


6768 


784C1P2B_1101 


10226 


1411 


3197 


4963 


6769 


784C1P2B_ 1102 


10232 


1412 


3198 


4584 


6770 


784C1P2B_ 1103 


10237 


3 4 3 3 


3199 


45t5 


6771 


784C3P2B 1104 


10279 


1414 


3200 


4986 


6772 


784CIP2C 1 


33 


143 5 


3201 


4967 


6773 


7 84CIP2C_2 


271 


1416 


3202 


49H8 


6774 


784CIP2C 3 


84 8 


1417 


3203 


4989 


6775 


7B4CIP2C_4 


849 


14Jfc 


3204 


4990 


6776 


784CIP2C_ 5 


86 4 


1419 


3205 


4S91 


6777 


784CIP2C_6 


953 


1420 


3206 


4992 


6778 


7 84CIP2C7 


980 


1423 


3207 


49 9? 


6779 


784CIP2C 8 


1595 


j 1422 


3208 


4994 


6780 


784CIP2C9 


1697 


3423 


3209 


4555 i 6781 


784C3P2C_10 


1744 




j 4 2 4 


3210 


4 956 


6782 


784C1P2C_11 


1937 


142S 


3211 


4 99'v 


6783 


784CIP2C_12 


1955 


1426 


3212 


499E 


6784 


784C1P2C 13 


1955 


3 4 21 


3213 


4 995 


6785 


784 CI P2C_1 4 


2185 | 


:426 


3214 


500C 


6786 


784CIP2C 15 


2889 


14 25 


321S 


500; 


6787 


76^CIP2C_16 


2901 


143C 


3216 


soo; 


6788 


784CIP2C 17 


2902 


14 31 


3217 


5003 


6789 


764CIP2C_18 


2905 


1432 


3216 


5004 


6790 


764CIP2C_19 


2946 


1433 


3219 


50O5 


6791 


784CIP2C_20 


2956 


3434 


3220 


5006 


6792 


7&4CIP2C 21 


2959 


3 435 


3221 


5007 


6793 


7S4CIP2C^22 


2965 


14 36 


3222 


5008 


6794 


7S4CIP2C_23 


2966 


3 43 7 


3223 


5009 


6795 


784CI?2C_ 24 


2970 


3430 


3224 


5010 


6796 


764CI?2C_25 


298£ 


3439 


3225 


501 J 


6797 


784CIP2C 26 


2987 


3440 


3226 _^ 


5012 


6796 


784CIP2C_27 


2993 


3441 


3227 


5013 


6799 


784CIP2C 28 


2993 


34 4 2 


3228 


5014 


6800 


784C1P2C_29 


3017 


1443 


3229 


5015 


6801 


7H4CIP2C_30 


3046 


1444 


3230 


5016 


6802 


784CIP2C_31 


3050 


144S 


3233 


5017 


6803 


784C:P2C_32 


3357 


1446 


3232 


5 016 


6804 


784C3P2C_33 


3359 


1447 


3233 


5019 


6805 


7&4C1P2C_34 


3432 


34 4 8 


3234 


5020 


6806 


784C1P2C 35 


3438 


344 9 


3235 


5021 


6807 


7&4CIP2C_ 36 


3439 


1450 


3236 


5022 


6B06 


764CIP2C 39 


3463 


14 51 


3237 


502? 


6809 


784C3P2C40 


3466 


14 5i. 


3238 


5024 


6810 


784C1P2C 41 


3466 


14 53 


3239 


5025 


6911 


784CIP2C42 


3467 


1454 


3240 


5026 


6912 


784C1P2C43 


346b 


1455 


3243 


5027 


6813 


784CIP2C_44 


34 83 


1456 


3242 


I 5028 


6814 


7 84C3P2C_45 


3484 


3457 


3243 


5029 


6915 


\ 784CJP2C_46 


348S 


1458 


3244 


5030 


6816 


784C3P2C 47 


3491 


J45S 


3245 


5031 


6 82 7 


7 84CJF2'C_4 8 


3493 


146C 


3246 


5032 


6818 


7 84 CI P2C_ 4 9 


3494 


1461 


3247 


5033 


6819 


784CJP2C 50 


349£ 


1462 


3248 


5034 


6820 


7B4CIP2C 51 


3496 


1463 


3245 


5035 


6823 


7B4CJP2C^52 


3503 


1464 


3250 


5036 


. 6822 B 


734C3P2C_53 


3503 


1465 


3251 


5037 


6823 


784C1P2C54 


3 504 


1466 


3252 


5038 


6824 


784CIP2C_55 


3511 


1467 


3253 


5039 


6825 


7B4C3P2C 56 


3531 


1468 


3254 


5040 


6826 


784C3P2C 57 


3536 


1 1465 


3255 


5041 


6827 


784CIP2C_S8 


3546 


1470 


3256 


5042 


6828 


7 84CIP2C_59 


3548 


1473 


3257 


5043 


6829 


784CIP2C_60 


3551 


1472 


3256 


£044 


6830 


784C3P2C 61 


3 553 


147? 


3259 


5045 


6831 


784CIP2C 62 


3564 


1474 


3260 


5046 


6832 


784CIP2C 63 


3567 


14 7 5 


3261 


5047 


6833 


784C1P2C_64 


3572 


1476 


3262 


5048 


6834 


784C1P2C_65 


3573 


1477 


3263 


5049 


6835 


784C1P2C^66 


3574 


1476 


3264 


5050 


6836 


7 84CIF2C_67 


3583 


1475 


3265 


| 5053 


6837 


784CIF2C_68 


3615 


148C 


3266 


5052 


6838 


784CIP2C_69 


3623 


1481 


3267 


5053 


6839 


784CJP2C_70 


3629 


1482 


3266 


5054 


6840 


784CIP2C71 


3666 


1483 


3269 


5055 


6841 


784CIP2C_72 


3667 


1464 


3270 


5056 


6842 


784CIP2C_73 


3906 


148S 


3271 


505" 


6643 


784CIP2C 74 


3912 


148f 


3272 


5056 


6844 


784C: P2C_75 


J924 j 


1487 

L 


3273~~ 1 5055 


6645 


784CIP2C 76 


3928 j 


I486 


3274 


[ SD6L 


6846 


784CIP2C_77 


3935 


1489 


3275 


5061 


6847 


784CIP2C^76 


3959 


1490 


3276 


5062 


6848 


784CIF2C_79 


3961 


1491 


3277" 


5062 


6809 


784C:P2C - 60 


3969 


149' 


327B 


5064 


6850 


764CIP2C^61 


4295 


1493 


3279 


5065 


6851 


784CIP2C,82 


4300 


149^ 


3280 


5060 


6852 


784C1P2C_83 


4360 


14 9S 


3 28l~ 


1 5067 


6853 


784CIP2C 84 


4362 


14 96 

I 


3282 


5066 


6854 


784C1P2C_85 


4371 


1 • 14 97 


3283 


5069 


6855 


784CIP2C^ 86 


4373 


14 98 


3284 


5070 


6856 


784C3P2C 87 


4376 


1499 


3285 


5071 


6867 


784C1P2C 89 


4378 


! 150C 


3286 


507^ 


6658 


784C3P2C 90 


4382 


! is"oi 


3287 


5073 


6659 


784C1P2C 91 


44C9 


I . 1502 


32 88 


5074 


6860 


784CJP2C 92 


4421 


1503 


3289 


5075 


6 861 


7A4C1P2C 9^ 


4421 


I 1S04 


3 290 


5076 


6 862 


7R4C1P2C 94 


4 42 6 


1505


3291


5077


6863


784CIP2C_95


4430


1506


3292


5078


6864


784CIP2C_96


4435


1507


3293


5079


6865


784CIP2C_97


4436


1508


3294


5080


6866


784CIP2C_98


4439


1509


3295


5081


6867


784CIP2C_99


4440


1510


3296


5082


6868


784CIP2C_100


4441


1511


3297


5083


6869


784CIP2C_101


4442


1512


3298


5084


6870


784CIP2C_102


4455


1513


3299


5085


6871


784CIP2C_103


4462


1514


3300


5086


6872


784CIP2C_104


4466


1515


3301


5087


6873


784CIP2C_105


4469


1516


3302


5088


6874


784CIP2C_106


4477


1517


3303


5089


6875


784CIP2C_107


4481


1518


3304


5090


6876


784CIP2C_108


4483


1519


3305


5091


6877


784CIP2C_109


4484


1520


3306


5092


6878


784CIP2C_110


4486


1521 


3307 


5093 


6 879 


784C1P2C 111 


4490 


1522 


3308 


5094 


6880 


784CJP2C 112 


4499 


1523 


3309 


5 095 


6 681 


784C3P2C 113 


450? 


1524 


3310 


5096 


6882 


784CIP2C 114 


4 506 


1525 


3211 


S097 


6883 


784CIP2C IIS 


450S 


1526 


3312 


5098 


' 6864 


784CIP2C_116 


4514 


1527 


3313 


5095 


6865 


764CIP2C_117 


4516 


1528 


3314 


5100 


6886 


7B4CIP2C_118 


4522 


1525 


3315 


5101 


6887 


764CIP2C_119 


4525 


1530 


3316 


5102 


6886 


784CIP2C 120 


4 527 


1531 


3317 I 5103 


6889 


784CIP2C 121 


4528 


1532 


3318 


5104 


6890 


764CIP2C_122 


4 52S 


1533 


3319 


5105 


6891 


784CIP2C 123 


4532 


1534 


3320 


5106 


6892 


784CIP2C_124 


4537 


| 1535 


3321 


5107 


6893 


784CIP2C_125 


4538 


1536 


3322 


5108 


6894 


784C1?2C_126 


4551 


i 1537 


332 3 


5109 


6895 


784CI?2C_127 


4552 


1 1538 


3324 


5110 


6896 


784C:P2C_128 


4559 


j 1539 


3325 


5111 


6897 


764CJF2C__129 


4567 


! 1540 


3326 


5112 


6898 


784CIP2C_130 


4568 


; 1541 j 


3327 


5113 


6 899 


784CJF2C_132 


- 458"5 


1542 


3328 


S114 


6900 


784C3P2C^133 


4592 


1543 


3329 


5115 


6901 


764CIP2C 134 


4609 


1544 


3330 


5116 


6902 


784CIP2C^135 


4616 


j 154 5 | 


3331 


5117 


6903 


784CJP2C_136 


-461-7 


j 1S46 


3332 


sue 


6904 


784CJP2C_137 


4618 


| 1547 


3333 


5119 


6905 


784C1P2C 338 


4620 







254E 


1 333<; 


I — 5120 


690( 


784CIP2C_139 


4 62 4 


1549 


333!- 


1 3121 


6 9C" 


784CIP2C_140 


4632 j 


2S50 


r 333* 


5122 


69Dc 


784CIP2C_141 


4634 


1551 


333'i 


5123 


69C<- 


764CIP2C_142 


4636 


2 552 


3336 


5124 


691 C 


784CIP2C_143 


4635 


1553 


3339: 


512S 


693' 


784CIP2C_144 


4643 


1554 


334C 


5126 


6912 


784CIP2C_145 


4644 


15SS 


334: 


5127 


cs>i;- 


784CIP2C_146 


4655 | 


1556 


3342 


5128 


6914 


784CIP2C_147 


4666 


1557 


3 34.- 


512S 


691: 


j 784C1P2C_148 


4677 


2558 


3 34 4 


5130 


69lf 


794CIP2C_149 


4677 


i559 


3345. 


• 5133 


6917 


784CIP2C .150 


4677 


< 1560 


3340 


5132 


6916 


784CIP2C_152 


4682 ) 


1561 


3 34 7 


5133 


6 915 


784CIP2C_153 


469C " 1 


1562 


3346 


S134 


692C 


784CIP2C_154 


4691 


1563 


3349 


513S 


692j 


784CIP2C_155 


4727 


1564 


335t 1 


5136 


692: 


784C1P2C_1S6 


4730 | 


1565 


335] 


5137 


692:- 


784CIP2C157 


4734 | 


1566 


3352 


513 8 


6924 


784CIP2C_158 


4757 \ 


1567 


3 3 53 


5139 


6921 


784CIP2C_159 


4764 


1568 


3354 


5140 


692< 


7B4C1P2C_160 


4786 , 


1569 


335h 


5141 


6927 


784C1P2C_161 


4793 i 


15 70 


335t 


5142 


6921' 


784CIP2C 162 


4825: | 


1571 


3357 


5143 


692? 


784CIP2C_163 


4826 j 


1572 


3356 


5144 


693C 


784C1P2C_164 


4850 


1573 


3355* 


5145 


693} 


784C1P2C_165 


4853" ~~| 


1574 


3360 


5146 


6932 


784CIP2C166 


4855. j 


1575 


3363 


514 7 


6 93;? 


784C2P2C167 


4 856 | 


1576 


336; 


514 8 


6934 


784C1P2C_ 168 


4867 j 


1577 


3362 


5149 


693! 


784C1P2C_ 169 


4865- J 


1578 


3364 


5150 


693t 


7 84CIP2C170 


4876* j 


1579 


3365 


5151 


6 93 7 


7B4CIP2C_171 


4880 


1580 


336 ( 


5152 


6936 


7B4C1P2C172 


4942 


1581 


3367 


5153 


6 93 5- 


784CIP2C_173 


4945 


1562 


3366 


5154 


6940 


784C1P2C_174 


4950 


1583 


33 6 9 


5155 


•6941 


764CIP2C_175 


4 952 


1584 


3370 


5156 


6 94 2 


784C1P2C176 


4 95< 


1585 


3371 


5157 


6941- 


784CIP2C_I7 7 


4958 


1586 


3372 


5158 


6944 


7B4C1P2C_178 


4 961 


1587 


33 73 


5159 


6945 


7B4CIP2C_179 


5590 


1588 


3374 


5160 


6941 


784C1P2C_180 


5599 


1589 


3371 


5161 


6 947 


784CIP2C_181 


5692 


1590 


3376 


5162 


6948 


784C1P2C_182 


5732 


1591 


3377 


5163 


6945 


784CIP2C_183 


5765 


1592 


3376 


5164 


6950 


7B4CIP2C_184 


5773 


1593 


3375 


5165 


6951 


784CIP2C^185 


5774 


1594 


338C 


5166 


6952 


784CIP2C_186 


579:- 1 


1595 


3382 , 


5167 


6951' 


784CIP2C^ 187 


5BCt | 


1596 


3382 ; 


5168 


6954 


784CIP2C_18 8 


5852 1 


1597 


III: —J 


5169 


695£ 


784CIP2C_189 


5BS2 J 


1598 


3384 ; 


5170 


6956 


784C1P2C_190 


6057 ( 


1599 


3385 ; 


5171 


6957 


784CIP2C_191 


6061 1 


1600 


33 86 


5172 


6956 


784CIP2C_192 


6105 


1601 


3387 


5173 


6955 


784CIP2C_193 


6160 


1602 


3386 1 


5174 


696C 


784CIP2CJ194 


6297 j 


1603 


3365 


5175 


6961 


784CIP2C_195 


63 9P : 


1604 


339C 


5176 


6962 


784CIP2C_196 


6398 


16 05 


3392 


5177 


6963 


784CIP2C 197 


6415 


1606 


3392 


5178 


6964 


784CIP2C_198 


6446 


1607 


3393 1 


5179 


6965 


784CIP2C 199 


6465 


1608 


3394 


5180 


6966 


784CIP2C_200 


6476 


1609 


3395 


5181 


6967 


784CIP2C_201 1 


6561 


1610 | 3396 | 5182 | 6968 | 784CIP2C_202 | 6574
1611 | 3397 | 5183 | 6969 | 784CIP2C_203 | 657t
1612 | 3398 | 5184 | 6970 | 784CIP2C_204 | 6662
1613 | 3399 | 5185 | 6971 | 784CIP2C_205 | 6672
1614 | 3400 | 5186 | 6972 | 784CIP2C_206 | 6692
1615 | 3401 | 5187 | 6973 | 784CIP2C_207 | 6695
1616 | 3402 | 5188 | 6974 | 784CIP2C_208 | 6746
1617 | 3403 | 5189 | 6975 | 784CIP2C_209 | 6898
1618 | 3404 | 5190 | 6976 | 784CIP2C_210 | 6938
1619 | 3405 | 5191 | 6977 | 784CIP2C_211 | 6943
1620 | 3406 | 5192 | 6978 | 784CIP2C_212 | 711C
1621 | 3407 | 5193 | 6979 | 784CIP2C_213 | 7200
1622 | 3408 | 5194 | 6980 | 784CIP2C_214 | 7212
1623 | 3409 | 5195 | 6981 | 784CIP2C_215 | 722.fi
1624 | 3410 | 5196 | 6982 | 784CIP2C_216 | 7249
1625 | 3411 | 5197 | 6983 | 784CIP2C_217 | 7500
1626 | 3412 | 5198 | 6984 | 784CIP2C_218 | 7509
1627 | 3413 | 5199 | 6985 | 784CIP2C_219 | 7523
1628 | 3414 | 5200 | 6986 | 784CIP2C_220 | 7544
1629 | 3415 | 5201 | 6987 | 784CIP2C_221 | 7564
1630 | 3416 | 5202 | 6988 | 784CIP2C_222 | 7568
1631 | 3417 | 5203 | 6989 | 784CIP2C_223 | 7631
1632 | 3418 | 5204 | 6990 | 784CIP2C_224 | 7813
1633 | 3419 | 5205 | 6991 | 784CIP2C_225 | 7831
1634 | 3420 | 5206 | 6992 | 784CIP2C_226 | 7843
1635 | 3421 | 5207 | 6993 | 784CIP2C_227 | 7907
1636 | 3422 | 5208 | 6994 | 784CIP2C_228 | 7943
1637 | 3423 | 5209 | 6995 | 784CIP2C_229 | 8175
1638 | 3424 | 5210 | 6996 | 784CIP2C_230 | 8216
1639 | 3425 | 5211 | 6997 | 784CIP2C_231 | 8225
1640 | 3426 | 5212 | 6998 | 784CIP2C_232 | 8273
1641 | 3427 | 5213 | 6999 | 784CIP2C_233 | 8397
1642 | 3428 | 5214 | 7000 | 784CIP2C_234 | 8466
1643 | 3429 | 5215 | 7001 | 784CIP2C_235 | 8503
1644 | 3430 | 5216 | 7002 | 784CIP2C_236 | 8953
1645 | 3431 | 5217 | 7003 | 784CIP2C_237 | 9106
1646 | 3432 | 5218 | 7004 | 784CIP2C_238 | 9139
1647 | 3433 | 5219 | 7005 | 784CIP2C_239 | 9555
1648 | 3434 | 5220 | 7006 | 784CIP2C_240 | 9650
1649 | 3435 | 5221 | 7007 | 784CIP2C_241 | 9889
1650 | 3436 | 5222 | 7008 | 784CIP2C_242 | 993j
1651 | 3437 | 5223 | 7009 | 784CIP2C_243 | 9953
1652 | 3438 | 5224 | 7010 | 784CIP2C_244 | 9981
1653 | 3439 | 5225 | 7011 | 784CIP2D_1 | 746
1654 | 3440 | 5226 | 7012 | 784CIP2D_2 | 3558
1655 | 3441 | 5227 | 7013 | 784CIP2D_3 | 1CCS J i>bo
1656 | 3442 | 5228 | 7014 | 784CIP2D_4 | 3633
1657 | 3443 | 5229 | 7015 | 784CIP2D_5 | ^ K«;ft
1658 | 3444 | 5230 | 7016 | 784CIP2D_6 | J ' j*
1659 | 3445 | 5231 | 7017 | 784CIP2D_7 |
1660 | 3446 | 5232 | 7018 | 784CIP2D_8 |
1661 | 3447 | 5233 | 7019 | 784CIP2D_9 | 4703
1662 | 3448 | 5234 | 7020 | 784CIP2D_10 | 4774
1663 | 3449 | 5235 | 7021 | 784CIP2D_11 | 4894
1664 | 3450 | 5236 | 7022 | 784CIP2D_12 | 4918
1665 | 3451 | 5237 | 7023 | 784CIP2D_13 | 5159
1666 | 3452 | 5238 | 7024 | 784CIP2D_14 | 7443
1667 | 3453 | 5239 | 7025 | 784CIP2D_15 | 8673
1668 | 3454 | 5240 | 7026 | 784CIP2D_16 | 8675
1669 | 3455 | 5241 | 7027 | 784CIP2D_17 | 8727
1670 | 3456 | 5242 | 7028 | 784CIP2D_18 | 8734
1671 | 3457 | 5243 | 7029 | 784CIP2D_19 | 8756










1672 | 3458 | 5244 | 7030 | 784CIP2D_20 | boil-
1673 | 3459 | 5245 | 7031 | 784CIP2D_21 | 8844
1674 | 3460 | 5246 | 7032 | 784CIP2D_22 | 6646
1675 | 3461 | 5247 | 7033 | 784CIP2D_23 | 8512
1676 | 3462 | 5248 | 7034 | 784CIP2D_24 | e9ie
1677 | 3463 | 5249 | 7035 | 784CIP2D_25 | 8916
1678 | 3464 | 5250 | 7036 | 784CIP2D_26 | 8941
1679 | 3465 | 5251 | 7037 | 784CIP2D_27 | 8S4I
1680 | 3466 | 5252 | 7038 | 784CIP2D_28 | e55i
1681 | 3467 | 5253 | 7039 | 784CIP2D_29 | 8951
1682 | 3468 | 5254 | 7040 | 784CIP2D_30 | 9007
1683 | 3469 | 5255 | 7041 | 784CIP2D_31 | 9012
1684 | 3470 | 5256 | 7042 | 784CIP2D_32 | 9013
1685 | 3471 | 5257 | 7043 | 784CIP2D_33 | 902^
1686 | 3472 | 5258 | 7044 | 784CIP2D_34 | 9053
1687 | 3473 | 5259 | 7045 | 784CIP2D_35 | 9054
1688 | 3474 | 5260 | 7046 | 784CIP2D_36 | 9064
1689 | 3475 | 5261 | 7047 | 784CIP2D_37 | 9113
1690 | 3476 | 5262 | 7048 | 784CIP2D_38 | 9134
1691 | 3477 | 5263 | 7049 | 784CIP2D_39 | 9152
1692 | 3478 | 5264 | 7050 | 784CIP2D_40 | 9152
1693 | 3479 | 5265 | 7051 | 784CIP2D_41 | 9211
1694 | 3480 | 5266 | 7052 | 784CIP2D_42 | 9223
1695 | 3481 | 5267 | 7053 | 784CIP2D_43 | 9223
1696 | 3482 | 5268 | 7054 | 784CIP2D_44 | 9231
1697 | 3483 | 5269 | 7055 | 784CIP2D_45 | 9236
1698 | 3484 | 5270 | 7056 | 784CIP2D_46 | 5236
1699 | 3485 | 5271 | 7057 | 784CIP2D_47 | 9303
1700 | 3486 | 5272 | 7058 | 784CIP2D_48 | 9305
1701 | 3487 | 5273 | 7059 | 784CIP2D_49 | 9314
1702 | 3488 | 5274 | 7060 | 784CIP2D_50 | 5326
1703 | 3489 | 5275 | 7061 | 784CIP2D_51 | 5339
1704 | 3490 | 5276 | 7062 | 784CIP2D_52 | 9346
1705 | 3491 | 5277 | 7063 | 784CIP2D_53 | 5376
1706 | 3492 | 5278 | 7064 | 784CIP2D_54 | 9382
1707 | 3493 | 5279 | 7065 | 784CIP2D_55 | 5407
1708 | 3494 | 5280 | 7066 | 784CIP2D_56 | 5414
1709 | 3495 | 5281 | 7067 | 784CIP2D_57 | 9439
1710 | 3496 | 5282 | 7068 | 784CIP2D_58 | 5485
1711 | 3497 | 5283 | 7069 | 784CIP2D_59 | 9493
1712 | 3498 | 5284 | 7070 | 784CIP2D_60 | 9501
1713 | 3499 | 5285 | 7071 | 784CIP2D_61 | 9526
1714 | 3500 | 5286 | 7072 | 784CIP2D_62 | 9526
1715 | 3501 | 5287 | 7073 | 784CIP2D_63 | 9555
1716 | 3502 | 5288 | 7074 | 784CIP2D_64 | 9557
1717 | 3503 | 5289 | 7075 | 784CIP2D_65 | 5566
1718 | 3504 | 5290 | 7076 | 784CIP2D_66 | 9566
1719 | 3505 | 5291 | 7077 | 784CIP2D_67 | 9557
1720 | 3506 | 5292 | 7078 | 784CIP2D_68 | 9615
1721 | 3507 | 5293 | 7079 | 784CIP2D_69 | 962t
1722 | 3508 | 5294 | 7080 | 784CIP2D_70 | 9645
1723 | 3509 | 5295 | 7081 | 784CIP2D_71 | 9652
1724 | 3510 | 5296 | 7082 | 784CIP2D_72 | 5660
1725 | 3511 | 5297 | 7083 | 784CIP2D_73 | 9662
1726 | 3512 | 5298 | 7084 | 784CIP2D_74 | 9725
1727 | 3513 | 5299 | 7085 | 784CIP2D_75 | 9746
1728 | 3514 | 5300 | 7086 | 784CIP2D_76 | 9777
1729 | 3515 | 5301 | 7087 | 784CIP2D_77 | 9787
1730 | 3516 | 5302 | 7088 | 784CIP2D_78 | 979C
1731 | 3517 | 5303 | 7089 | 784CIP2D_79 | 9842
1732 | 3518 | 5304 | 7090 | 784CIP2D_80 | 964.2
1733 | 3519 | 5305 | 7091 | 784CIP2D_81 | 5'846







1734 | 3520 | 5306 | 7092 | 784CIP2D_82 | 9867
1735 | 3521 | 5307 | 7093 | 784CIP2D_83 | 10010
1736 | 3522 | 5308 | 7094 | 784CIP2D_84 | 1001-
1737 | 3523 | 5309 | 7095 | 784CIP2D_85 | 10052
1738 | 3524 | 5310 | 7096 | 784CIP2D_86 | 10057
1739 | 3525 | 5311 | 7097 | 784CIP2D_87 | 10085
1740 | 3526 | 5312 | 7098 | 784CIP2D_89 | 10139
1741 | 3527 | 5313 | 7099 | 784CIP2D_90 | 10142
1742 | 3528 | 5314 | 7100 | 784CIP2D_92 | 101€5
1743 | 3529 | 5315 | 7101 | 784CIP2D_93 | 10173
1744 | 3530 | 5316 | 7102 | 784CIP2D_94 | 10173
1745 | 3531 | 5317 | 7103 | 784CIP2D_95 | 10273
1746 | 3532 | 5318 | 7104 | 784CIP2E3 | 3121
1747 | 3533 | 5319 | 7105 | 784CIP2E_: | 3628
1748 | 3534 | 5320 | 7106 | 784CIP2E_4 | 3673
1749 | 3535 | 5321 | 7107 | 784CIP2E_5 | 40ie
1750 | 3536 | 5322 | 7108 | 784CIP2E_6 | 4467
1751 | 3537 | 5323 | 7109 | 784CIP2E_7 | 4865
1752 | 3538 | 5324 | 7110 | 784CIP2E_8 | 4916
1753 | 3539 | 5325 | 7111 | 784CIP2E_9 | 4923
1754 | 3540 | 5326 | 7112 | 784CIP2E_10 | 4926
1755 | 3541 | 5327 | 7113 | 784CIP2E_11 | 4962
1756 | 3542 | 5328 | 7114 | 784CIP2E_12 | 4963
1757 | 3543 | 5329 | 7115 | 784CIP2E_13 | 4964
1758 | 3544 | 5330 | 7116 | 784CIP2E_14 | 4988
1759 | 3545 | 5331 | 7117 | 784CIP2E_15 | 5835
1760 | 3546 | 5332 | 7118 | 784CIP2E_16 | 7682
1761 | 3547 | 5333 | 7119 | 784CIP2E_17 | 7682
1762 | 3548 | 5334 | 7120 | 784CIP2E_18 | 7699
1763 | 3549 | 5335 | 7121 | 784CIP2E_19 | 77C7
1764 | 3550 | 5336 | 7122 | 784CIP2E_20 | 7707
1765 | 3551 | 5337 | 7123 | 784CIP2E_21 | 7752
1766 | 3552 | 5338 | 7124 | 784CIP2E_22 | 8357
1767 | 3553 | 5339 | 7125 | 784CIP2E_23 | 9065
1768 | 3554 | 5340 | 7126 | 784CIP2E_24 | 9324
1769 | 3555 | 5341 | 7127 | 784CIP2F_1 | 2976
1770 | 3556 | 5342 | 7128 | 784CIP2F_2 | 3559
1771 | 3557 | 5343 | 7129 | 784CIP2F_3 | 4021
1772 | 3558 | 5344 | 7130 | 784CIP2F_4 | 4474
1773 | 3559 | 5345 | 7131 | 784CIP2F_5 | 4566
1774 | 3560 | 5346 | 7132 | 784CIP2F_6 | 4705
1775 | 3561 | 5347 | 7133 | 784CIP2F_7 | 4707
1776 | 3562 | 5348 | 7134 | 784CIP2F_8 | 4712
1777 | 3563 | 5349 | 7135 | 784CIP2F_9 | 5008
1778 | 3564 | 5350 | 7136 | 784CIP2F_10 | 5009
1779 | 3565 | 5351 | 7137 | 784CIP2F_11 | 5015
1780 | 3566 | 5352 | 7138 | 784CIP2F_12 | 5015
1781 | 3567 | 5353 | 7139 | 784CIP2F_13 | 7724
1782 | 3568 | 5354 | 7140 | 784CIP2F_14 | 7725
1783 | 3569 | 5355 | 7141 | 784CIP2F_15 | 8828
1784 | 3570 | 5356 | 7142 | 784CIP2F_16 | 8830
1785 | 3571 | 5357 | 7143 | 784CIP2F_17 | 9739
1786 | 3572 | 5358 | 7144 | 784CIP2F_18 | 9896
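
Table 7, which follows, annotates each amino acid segment with one-letter residue codes together with three markers (* for a stop codon, / for a possible nucleotide deletion, \ for a possible nucleotide insertion). As a minimal illustrative sketch, and not part of the disclosed methods, the Python snippet below shows one way such a segment string might be tallied against that legend. The names `RESIDUES`, `MARKERS` and `summarize_segment`, and the sample string, are hypothetical and chosen only for illustration.

```python
# One-letter residue codes as listed in the Table 7 legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue markers from the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Tally legend residues and markers; collect characters outside the legend."""
    residue_count = 0
    marker_counts = {name: 0 for name in MARKERS.values()}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue  # spacing and line breaks carry no meaning
        if ch in RESIDUES:
            residue_count += 1
        elif ch in MARKERS:
            marker_counts[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)  # anything the legend does not define
    return residue_count, marker_counts, unrecognized


if __name__ == "__main__":
    # Made-up segment for illustration only; it is not taken from Table 7.
    print(summarize_segment("MKT/LV\\AX*"))
```

Collecting the characters that fall outside the legend, rather than discarding them, makes it easy to spot entries in a listing that do not conform to the stated notation.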
TABLE 7 



SEQ ID NO: 


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 


Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion) 


5359 


337 


j 1 31 


AHLSARLSAL2LDEVA1LPAP0NLSVLSTNMKJ1LLMWSPVIAPG 
ETV YY S VEYQG E Y ES LY TS H I K I PS S WCS LTEG P BCD VTDD I TA 
TVP^-NLRVRATLGSCTS/CLEHP/VSIPLIETCPSLPDL/RMEI 
TKDGFHLVIEI.EDLGFOFEFLVA^WRRSPGAEEHVKMVRSGGIF 

VHL.L I Kj^FvrAMYL VK/'.U » r VKA ItiK i bA-r VEVSA?tAl ru 

VLAL F A F VG PM L I L WV P b F W KMGR LLQ / YL b b P RG G S SQTP W 
KITQF 


5360 


2 


21 1 i 


PRVPi.^GGOEDFASOCUARPRFTQPSKMRRRVlARPVGSSVRLK 
CVASG>^PRPDITW^7KL•D0AL^'HPEAAEPRKKKWTrJSlJKNLRPED 
SGKYTCR VSNRAGA I NATYKVDVJ OR TRSKP VLTGTB PVNTTVD 
FGGTTSFOCKVRSDVKPVIOWLKRVEYGAEGRHNSTIDVGGOKF 
W b PT G DVW S R PTX»S Y 1-N KbLI TRAR ODD AGH Y I C LGANTMG Y S 
FRSAFbTVLPDPK?PGPPVAS£SSATSbPWFVVIGlPAGAV?lL 
GTLI.LWL.CQAQKKPCTFAPAPPLPGHRPPGTARDRSGDKDLPSL 
AALSACPGVGbCEEBGSPAAPQHbbGPGPVAGPKLYPKbYTGHS 
TPHTYTHPPPSCQbNSSHS 


5361 


3 


92 5 


HEGSJSSANlLLDDQFOPKbtDFAMAHPRSHLEHOSCTINMTSS 
SSKbbW YMPEEYI ROGKLS I XTDVYS FGIV3 MEVLTGCR WbHG 
PKHJOLRDLLREbMEKRGLDSCLSFbDKKVPPCPRNFSAKbFCL 
AGRCAATRAKLR PSKDE VbKTbESTQAS bYf AEDPPTSLKS FR C 
PSFLFbENVPSlPVEDuESQNNNLbPSDEGbRIDRMTCKTPFFX 
S0SEVMFLSLDKKPF.SKKHEEA,CNMPSSSCEE5WFPKY3VP£0D 
bRFYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYEQYKKE 


5362 


2 


<5 87& 


SCQVEGCTRTYNSSOS J GKHMKTAHPDQYAAFKMQRKSKKGOKA 
NNbKTPNNGKFVYFLPSPWSSNPFFTSQTKANGNPACSAObOH 
VSPPIFPAHIJISVSTPLLSSMESVINPNJTSQDKN^OGGMLCSQ 
MENb P S T ALP AQMEDbT XTV b P bN 1 DRG SDP F b S L? AES S S I DL 
FPSPADSGTNSVFSOLF.KNTNHYSSG; J EGNTNSS FbKGGNGENA 
VFPS QVNVANNFSSTNACQSA PEKVK KD RG RGQTG KEK KP KHN K 
RAKK PA 1 1 RVCKF1 CS fc CYRAFTN PR SLGGHLS KRS YCKPLPGA 
EI AC ELbQSNGQPS LLA5M I bSTNAVNLCOPQOS TFNPEACFKP 
PSFLOLLAENRSPAFLPNTFPRSGVTNFNTSVSCEGSElIIOAb 
ETAG I PSTFEGAEMbSHVSTGCVSDASOVWATVMPWPTVpPbLH 
TVCHPNTLLTNONRT.SlSi.SKTSSIEECSSLPVFPTNpr^bLKTVEN 
GLCS £ S FPNSGG PS ON FTSNSS RVSVI SG PQNTKSSH LMKKGK S 
ASKRRKKVAPPblAPNASQNbVTSDLTTMGLIAKSVEIPTTNLH 
SNVIPTCEPQSbVENLTOKLNNVNNOLFMTDVKENFKTSbESHT 
VLAF bTLKTBNG DS J NS CTTSVNSDLQ3 SEDNV I QNFEKT 
LEII KTAMNS01 LEVKSGSOGAGETSQNAOINYNlQbPSVNTVO 
NNKLFD5SP\FSSFISVMPTESNIP0SE\VSHKEDQI0EILEGb 
0KbKbEKDbSTPAS0C\O.lNTSVTLTPTPVKSTAr>ITV30PVSE 
MIN 1 C FNDKVNK P F VCC-NQGCN YS AMTKDALFKK YGK I HQYTP E 
MI LE I KKNQLKFAP FKC WPTCTKTFTRNSNLRAHCQLVHHFTT 
EEMVKLKIKRPYGRKSOSENVPASRSTQVKKOLAMTEE!«XESO 
PALE1RAET0NTHSKVAVIPEK0LIEKKSPDKTESSLQVITVTS 
EOChTmALTNTOTKGRKIRRHKKEKEEKXRKKPVSQSLEFPTRY 
SPYRFYRCVHOGCFAAFTIQONblbHYOAVHKSDLPAFSAEVEE 
ESEAGKESEETETKQTLKEFRCQVSDCSRlFQAlTGLlQHYMKb 
HEMTPEEiES^ASVDVGKFPCDObECKSSFTTYLNYVVHbEAT) 
HG IGLRAS KTEE DGVY KCDCEGCDR I YATRS NLLRHI FtsIKHNDK 
HKA1DL I R PRR LT PGQENMSS KANQEKSKS KHRGT KHS RCGKEG I 
KMPKTK^KKKNNLENKNAKI VOI EENKPYSLKRGKHVYSI KARK 
DALSECTSRFVTQYPCKI KGCTSWTSESNI IRKYKCHKLSKAF 
TSOHRNbLIVFKRCCNSQVKETSEOEGAKl^DVKDSDTCVSESTO 
NSRT7 AT VSO KE VEKNE * DEMPELTELFI TKL I NEDS TS VETCA 
NTS SNV SNDFQEDKLCQS EROKASNLKRWKEKNVSQN KXRK\T 
XAE PAS AAELSS VRKEEETAVA i OTI EEHPAS FDK S $F KPMGFE 
VSFLKFLEESAVKOKKKTDKDHFKTGNKKGSHSNSRKNIDKTAV 
TSGNHVCPCKES ET FVCFAN PSObQCSDNVK I VLDKNLKDCTEL 











VLK0L0EMKPTV5LKKLEVHSNDPDMSVMKDISIGKATGRGQY I 


5363 



8066 


TO? 


RLC7CTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGlRKDFSRRLRRIANLVATCLPVRASLPHRUxML 
RG PG PGLLLLAVLCLGTA VPSTG AS K SKROAQQMVQ PQS P V AVS 
QS KPGCYDNG KHYO I NQQWERT YLGNALVCTCYGGS RGFNCE S K 
PEAE5TCFDKYTGNTYRVGDTYERPKDSM1 WDCTCI GAGRGR 3 S 
CTI ANRCHEGGQS Y K 1 CDTWR R f U ETGGYKhECVCLGNGKGE WT 
CKPlAEKCFDHAAGTSYVVGETWL , KPyQGV;M!4VDCTCLGEGSGR 
JTCTSRKRCNDODTRTSYRIGDTW'SKKDNRGNLLQCICTGNGRG 
EVJKCERHTSV01TSSGSGPFTDVRAA\7YOPOPHPCPPPYGHCy/T 
DSGVVYSVGMOI^ * KTOGNKOMLXCTL-LGNGVSCQETAVTOTYG 
GNSNGEPCVLPFTY^RTFYSCTTEGRQDGHLWCSTTSNYECtJQ 
KYS FCTDHTVLVOTRGGNSNGAICHFPFLYTWHNYTDCTS EGRR 
DN M K W CG TTQN Y DADO K FGF C PKAAH EE I CTTNEGVN Y R I GLOW 
DKQKDMGHMMRCTCVGKGRGEWTCIAYSQLRDOCIVDDITYNVN 
DTFHKR>IEEGHHI*NCTCFGOGRGRWKCDPVI)OCODSETGTFYOI 
GDSWEKYVHGVRY0CYCYGRG3GEWHCQPL0TYPSSSGPVEVFI 
TETPSQPNSHPI OWNAPQPSK3 S KY I IjRWR PKNSVGRWKEAT 3 P 

ghlnsyt3kglkpgwyegcl3s1qqyghqevtrfdftttstst 
pvtsntwtgettpfspl.vatsesvte1tassfwswvsase7v 
sgfrveyelseegdep0ylvlps7atsv\nip\dll,pgrky3vn 
vyoisedgeoslilstsottafdafpdptvdqvddtsiwrwsr 
pqapitgyr:vyspsvegsstelnl?etansvtlsdlopgvqyn 
1tiyaveenoestpw1q0ettgtprsdtvpsprdlofvevidv 
kvtimwtppesavtgyrvdvipvnlpgehgorlplsrntf\aen 
tgi »s pgvty y fkv fav s hgr e £ k. p 1 /taqqtt kl>\ daptn lqf vn 
etdstvlvrwtppraqitgyrltvgltrrgqprqynvgpsvsky 
plrhlqpas eytvs lvai kgnoe s pkatgvfttlqpgss ippyn 
tevtett2 v i twtpapr i gfklg vrpsoggeaprevtsdsgs 1 v 
vsgltpgvey^/ytiovlri>ooerdap\ivnk\vvtplspptnlh 
lean pdtgv ltvsw ers ttpp 1tgyr1 tttptngqognsbef. w ; 

rIADQSSCTF\DNLEVPGLEYNVSVTTVKDDKESVPISDTIIPAV 
P P PTDLRFTH / 1 LC PDTMRVTW \ A P P PS I D1»TN FLVRY SPVKNE 
GRMLOSLS 1 FFLSDN\AWLTKLLPGTEyWSVSSVYEOHESTP 
\LRGR0KTGLDSP\TG1DFS\DJTA\NSFT\VHW\IAPRA/TFI 
TGYRIR\KHPEHF\SGRPREDR\VPHSRNS1TLTNLTPGTEYW 
£ 1 VALNG R E ES P 3 GQQSTVS D V P R DLE WAATPTSLLI \ S W D 
APAVTVRYYR I TYC-ETGGNSFVCEFTVPGSKSTAT1 SGLKPGVD 
yjXTVYAVTGRGDSPASSKPISlNYRTEIDKPSQMQVTDVQDNS 
I SV KWLPSSS PVTG YRVTTTV PKNGPG\PTKTKTAGPDOTEMT I 
EGLQPlVEY\ r V5VYAONPSGES0PLVOTAVTNIDRPKGLAFTDV 
DVDSIKIAWESPgG0VSKYRVTY£SPEDGIHELFPAPDGEr:OTA ! 
ELOGLRPGSEYTVcWAl^HDDMESQPblGTOSTAIPAPTDLKFT | 
QVTPTSLSAOWTPPKVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SVWSGLMVATKY^VSVYALK^TLTS^PAQGVVTTLENVSPFnR 
ARVTDATETTI TISWRTKTETITGFQVDAVPANGOTPIQRTI KP 
DVRSYTITGbOPGTDYKIYLYTLKDNARSSPWIDASTAIDAPS 
NI>RFLATTPNSLLVSW0P?RAR1TGYIIKYEKPGSPPREWPRP 
RPGVTEAT1 TGLE PGTEYTI Y V I ALKNNQKSEPLIGRKKTDELP 
QLVTLPH PNLKG PE 1 LDVPSTVO KTP FVTHPGYDTGNG I QLPGT 
SGQQPSVGQOMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGOE 
ALSQTTISWAPFOCTSEYIISCHPVGTDEEPIiQFRVPGTSTSAT 
LTGLTRGAT YN 1 1 VEALKDQQRK KVREEWTVGNSVKEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNG VW Y K I GEK WDROGENGQMMSCTCLGNGKGE FKCDP 
KEA7C Y DDG KT Y HVG EQ WQKE Y LG AT CSCTC FGGQRGWR CDN CR 
RPG^EPSPEGTTGOSYWOYSORYHORTiTrNVKCPIECFMPLDVO 

ADREDSRE 


5364 


6066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVbCIPSVPPPVPFPTLWP 
PPSWRROPPGGIRRDFSRRLRREANLVATCLPVRASLPKRL?*ML 
RGPGPGLLLbA VLCLGTAVPSTGASKSKRQAQOMVQPQS PVAVS | 








OSKPGCVDNGKHyOINOOWERTyLGNALVCTCyGGSKGFNCESK'n 

PLAEETCF DKYTGNTV RVGDTY ERPXDSM3 WDCTCI GAGRGR 1 S 

CT:ANKCHEGGOSYKlGDTWRRPHETGGYMbECVCL<^NGKGEv:T 

CKPIAEKCFDi^GTSr^GETWEKPYC<5WMMVDCTClX5EGSGR 

3TCTSRNFCNDQDTRTSYRTGDTWSKKDNRGNLL0C1CTGNGRG 

EvmCER>iTSV0TTS5GSG?FTDVRAAVY0PQPHPQPPPYGHCVT 

D£GWYSVGMOLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 

GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTShrYEQDO 

KYSFCTDHTVXVOTRGGNSNGALCHFPFLyNNJINYTDCTSEGRR 

DKMKWCGTTQN YDADOKFG FC PMAAHEE 1 CTTNEGVMYR 1 GDQW 

DKOHD^HHMRCTCVGNGRGEWTCIAYSObRDQCIVDDlTYNVK 

DTFHKRHEEGHMLNCTCFGOGRGRWKCDPVDQC0DSETGTFYOI 

GDSWEKYVHGVRYOCYCyGRGIGEWHCQPLOTYPSSSGpVEVFl 

TETPS0FNSHP10WNAPQP.SH1SKYILRWRPKNSVGRWKEAT1P 

GHIiNSVTI KGLK.PGWYEG0L1SIQQYGHQEVTRFDFTTTSTST 

FVTSNT \VTGETTPFS PLVATSESVTEI TASS FWSWVSASDTV 

SGFRVEYELSEEGDEPQyLVLPSTATSV\NIP\DIiLPGRKYlVK 

VYCISEDGEQSLILSTSOTTAPDAPPDPTVDOVDDTSIWR^SR 

PQAPITGYR1VYSPSVEGSSTELNLPETANSV7LSDLQPGVOW 

IT1 Y AVEENQESTPW3 QQETTGTPRSDTVPS PRDLOFVEVTDV 

KVTlMWTPPESAVTGYRVDVIPWLPGEHGORLPLSRNTF\AEK 

TGLSPGVTYYFKVFAVSHGRESKPLTAQOTTKL\DAPTNLOFVN 

ETD5TVLVRWTPPRA03TGYRLTVGLTRRGQPRQY.WGPSVSKY ! 

PLRNLQPASEYTVSLVA1KGNQESPKATGVFTTLQPGSSIPPYN 

TEVTETT3 V I TKTPAPR 1 GFXLG VRPSQGGE APREVTSDSGS I V i 

VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPbSPPTNLK ; 

LE/uMPDTGVLTVSWERSTTPDlTGYRITlTPTN'GOOGNSLEEW 

HADOSS CTF\ DN LBVPGLE YNVS VYTVKDDKES V P I S DT 1 1 P AV 

PFPTDLRFTN/ 1 LGPDTMRVTW\APPPSlDLTNFbVRYSPVKNE 

GKMhQShSlFFhSDN\AWLTmjhPGZEYWSVSSVYEQHESTP 

\LRGRQKTGLDSP\TGJDFS\DITA\NSFT\VHW\3APRA/TPI 

TGYRIR\HHPEHF\SGRPREDR\VPHSRNS1TLTNLTPGTEYW 

SIVALNGREESPLblGOOSTVSDVPRD^EWAATPTSLLl\SWD 

APAVrVR YYR I TYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 

YTJ TV Y A VTGR G DS PASS K P I S I N YR TE I DKPS QMQV TDVQDNS 

I SVKWL?SSSPVTGYRVTTT\ PKNGPGX PTKTXTAGPDQTEMTI 

EGLO PTV EYV\'S V Y AONP SGESQPLVQTAVTN I DR PKGIoAFTD V 

DVTS3KIAWESPOGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 

ELQGLR PGSEYTVS WALHDDMESQPL I GTQSTAI PA PTDLKFr 

QVT PTS LS AO WTP PNVOLTG Y R VRVT PK EKTG PNKE 1 NLAPDS S 

S VWSGLMVATKY EVS VYALK0TLTS RP AQG WTTLENVSP PRR 

AR VTDAT ETT I T I S WRTKT ET I TGFQ VD AV P ANGQTP I QRT 3 XP 

DVR S Y T I TGLOPGTDYKI YLYTLNDNARS SP Wl DASTAI DA PS 

NLRFLATTPNSbLVSWCPPRARITGYI I XYEXPGSPPREWPRP 

RPGVTEATITGLEPGTEYTI YVIAI.KNNQKSEPL3 GRKJCTDELP ; 

QLVTLPH PNLHGPEI LDVPST VQKTP FVTH PG YDTGNG I QLPGT 

SGOCPSVGOOf^IFEEHGFRRTTPPTTATPIRHRPRPYPPNVGOE 

ALSQTTI SWAPFQDTSEY1 1 SCHPVGTDEEPLQFRVPGTSTSAT 

LTGLTRGATYV1IVEAIjKDQ/3RHKVREEVVTVGNSVNEGLNQPT 

DDSCFDPYTVSHYAVGDEWERMSESGFKLLCOCLGFGSGHFRCD 

SSR^XHDNGWYKIGEKWDROGENGQKMSCTCUSNGXGEFKCDP 

HEATCYDDGXTYHVGEQWOKEYLGAICSCTCFGGORGWRCDNCR 

RPGGEPSPEGTTG0SYNQYSQRYHQRTNTNVNCP1ECFMPLDVC 

ADREDSRE 


5365 


8066 


702 


RLCCTGGGEGTPGASGKRGPAATTSLVLCI PSVPP PVPFPTLWP 
PPSWRRQPPGG3RRDFSRRLRREANLVATCLPVRASLPHRLWIL 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAOOMVQPQSPVAVS 
QSKPGCYDNGKKyOINOOWERTYLGNALVCTCyGGSRGFNCESK 
PEAEETCFDK YTGNTYRVGDTYERPKDSK I WDCTC I GAGRGR I S 
CTJ AMR CHEGGOS Y K I GDTWRR PH ETGGYMLECVCLGNGKGE W7 
CKPIAEKCFDKAAGTSYWGETWEKPYQ^WI-IMVDCTCLGEGSGR 


irctsknrcndontrtsvrigdtwskkdnrgnllccicrgngrg 
ewkgkrhtsvqtts'sgsgpftdvraavyopqphpcpppyghcvt 
dsg'v^wsvg^oi^.^ktqgnkomlXctclgngvsccetavtotyg 
gnskgepcvlpft^grtfyscttegrqdghlwcsttsnyeodo 
kysrctdhtvlvctrggnsngaiichfpflynnh^rvtdctsegrr 
dnmkwcgtton y d/jx?kfgfcpmaahee 1 cttne3vmyr i gdow 

DKOliD^GKMMRCTCVGNGRGEWTClAYSOLRDOClVDDlTYNVN 
DTFHKRHEEGHMLNCTCFG0GRGRWKaDPVDOCQDSETGTFYC<3 
CDS W [ ;K Y VKGVR Y CCY CYGRG1 GEWHCQPLQTY PSSSG P VE VF I 
TETP£?CPNSHPIOWNAPOPSHISKYILRWRPKNSVGRWKEATIP 
GHLNPYTI KGLXFG WYEGQL1 S 1 QQYGKQEVTR FDFTTTSTST 
PVTSNTWTGETTrKSPLVATSESVTEITASSFVVSWVSASDTV 
SGFRVEYELf>EECPEPOYLVLPSTATSVNNIP\PLLPGRKYIVK 
VYQI SEDGEOSL3 t-STSOTTAPDAPPDPTVPOVDDTSn^VRWSR 
P0APITGYR1VYSPSVEGS5TELNLPETANSVT1.SDL0PGVQYN 
JTIYAVEENgESTFWJOQETTGTPRSDTVPSPRDLCFVEVTDV 
KVT1KWTFPFSAVTGYRVDVIPVNLPGEHG0RLPLSRNTF\AEK 
TGLSPGVTYYFXVFAVSHGRESKPLTAQQTTKL\DAPTNLOFVN 
ETD S T VIA/ R W T P ? RAQ 1 TG Y RLT VG LTRRG QPR 0 YNVG P S VS KY 
PLRNLCPASKyTVSl.VAl KG.VOESPKATGVFTTLOPGSSI P?YN 
TFVTETTI V3 TWTPAFR I GFKLGVRPSQGGEAPREVTSDSGS I V 
VSGl.T?GVEY\ r YTJOVLRDGOERDAP\IVNK\WTPl»SPPTNLH 
LEAKFDTGVLTVSWtRSTTPDITGYRlTTrPTNGOOGNSbEEW 
HADCr SCTP\ UNLKVPGLEYMVSVYTVKDDKESVPI SDTI 1 PAV 

PPPTM.RFTN/ : lgpdtmrvtwNapppsidltnflvrys pvkne 

GRMl.OSl.SlFn.SDNXAWLTm.LPGTEYWSVSSVYEOHEGTP 

\lrgroktgldsp\tgidfs\dita\nsft\vhw\iapra/tpi 
tgynij?\hkpfjjf\sgrpredr\vphsrnsitlti>3ltpgteyw 

SIVALNGREES PLLI GQQSTVSDVPRD^EWAATPTSLLl \SWD 
APAVTVRYYR1TYGF.TGGNSPV0EFTVPGSKSTATISGLKPGVO 
YTITVYAVTGKGDSPASSKPISINYRrBIDXPSQMGVTDVODNS 
ISVKW1,PSS^PVTGYRVTTT\PKNGPG\PTKTKTAGPD0TEMT3 
EGLCPTVEYWS VY A0NPSGES0PLVQTAVTN1 DR PKGIJ^FTDV 
DVDSlKIAWESPOGOVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELOGI.m PGSEYTVS WALHPDMESOPLIGTQSTA: PAPTDLXFT 
QVTPT?LSAO^'TPP;JVOLTGYRVRVTPKEXTGPMKEINI.iAPDSS 
SVWSGLMVAT X Y EVS V YALKDTLTSR P AQGVVTTIjEJJV S P PRR 
ARVTDATETTITISWRTXTETITGFOVDAVPANGOTPIORTIKP 
DVRSY7 ITGIX'PGTDYKIYLYTLNDNARSSPWJ DASTAIDAPS 
NLRFLATTPNSLLVSWQPPRARI1X5Y1IKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYT1YVIALKNNQKSEPL1GRKKTDELP 
0LVTL;mPNUJGPEIlX>VP57V0KTPFVTHPGYDTGNGI0LPGT 
SGOOPSVGOOMIFEEHGFRRTTPPTTATPIRHRPRPYPPN'VGOE 
ALSCTTISHAPFODTSEYI1SCHPVGTDEEPLOFRVPGTSTSAT 
LTGXTPGATYNIIVEALKDQQRHKVREEVVTVGNSVWEGLNQPT 
DDSCFDPYTVSHYAVGDEKBRMSESGPKLbCOCLGFGSGHFRCD 
SSRWCKDNGVNYX 1 GEXWDROGENGQNMSCTCLGNGKGEFXCDP 
HEATCY DCiGKTYKVG EQWOKEYLGA1CSCTCFGGCPGWR CDN CR 
RPGGF.PSPEGTTGOSYNQYSQRYHQRTN7NVNCP1ECFMPLDVQ 
ADREDSRE 


5366 


8066 


703 


RXCCTGGGEGTPGASGKRGPAATTSLVLC1PSVPPFVPFPTLWP 
PPSWR POP PGG I RRDFS RRLR REANLVATCbPVRA.S LPH R I WML 
RGPGPGLLLLAV LCLGTAVP STG ASKSKRQAQQMVO PQS PV AVS 
OSKPGCYDNGKi^YOlNOOWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGR I S 
CT1 ANRCHEGGDSYX I GDTWRRPHETGGYMliECVCLGNGXGEWT 
CKPIAEKCFDF^.GTSYVVGETWEKPYQGWflMVDCTCLGEGSGR 
1 TCTSR>'RCNDOnTRTSYRIGI)TWSKKDNRGNL,IiCCl CTGNGRG 
EWKCERHTSVOTTSSGSGPFTDVRAAVYQPOPHPOPPPYGHCVT 
DSGWY5VGMOUA*KTOGNK0ML\CTCLGNGVSC0ETAVT0TYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRODGHLWCSTTSNYEQDO 








y V S r rtDHTVLVC?TRGGKlSNGALCH r PFLY1WKNYTDCTSEGRR 
I* N MXVICG TTOMY DADO K FGFC PMAA)4 FX 1 CTTNEGVMY R 3 GDQW 
l: KCHD'^GHKHRCTCVGNGHGEWTCI avsqlrdcci vdditynvn 
I.TFHKRHEECHMLNCTCFGQGRGRWKCDPVD0CQDSETGTFY01 
(OSK£KWHGV1*YQCYCYGRGIGEWHCOFLQTYPS5SGPVEVF2 
STPSQPNS H P I QWKA PQ PSH I S K Y 1 L KWR PKNS VGR W K S AT 1 P 
C : H 1 >K S YT 1 KG Jb K PG WY E GQL I S 1 00 Y GH 0 E VT R FDFTT T S TS T 

V V7S NT \V TC ; E TTPF5 PLVATSES VTE 3 TASSFWS WVSASDTV 
SC." FRVPJYEL?EEGDEPOYLVLPSTATFV\NJ P\DLLPGRKYIVN 

V V 0 1 55 EDG EOS L I LSTSOTTA PDAPPD PTVDOVDDTS I WRWS R 
K-AFITGYR 4 VYSPSVEGSSTELNLPKTANSVTLSDLOPGVQYN 
: 73 YAVEENOESTPWlQQETTGTPRfnTVPSPRDIiOFVXVTDV 
KVTI MWTPPESAVTGY RVDVI PVNLFGF.HGORLPLSRNTFXAEN 
1 GLSPGVTYY FKVFAVSHGRESXPl»TAOOTTKb\DAPTNLQFVN 
T TDSTVLV RWTP PRAO 3 TGYRLTVGLTInRCQP RO YNVG P SVSKY 
rLRNLOPASEYTVSLVAIKGNQESPKATGVrTTLOPGSSIPPYN 
T-IVTETTIVITWTPAPRIGFKLGVRPSQGGEAPSEVTSDSGSIV 
VSGLTPGVEVWTIQVJLRDGQERDAPX I VNK\ WTPLSPPTNLH 
1 LAjgpDTGVLTVSWERSTTPDITGYRITTTPTNGOOGNSLEEW 
! 1ADQS SCTF\ DNLEVFGLEYNVSVYTVKDDKESVPlSDTI I PAV 
? rTDLRFTN7lLGPCTMRVTW\APp.^51DLTKFLVRYSPVKNE 
G H KLOSLS I F FLSDN\ AWLTNLLPGTK Y WSVS S VYEOHESTP 
\LhGROKTGliUSP\TGlDFS\DlTA\WSFT\V}H«\IAPRA/TPI 
Tr-YR]R\HHFFHF\SGRPREDR\VPHSPNS1TLTNLTPGTEYW 

i : valmgreesplligoqstvsdvprbi.ewaatptslli \swd 
;.pa\rrvryyfa2tyge7ggnspvqef7vpgskstatisglkpgvd 
ytltvyavtgt<gdspasskpisinyrteidkpsymqvtdvodns 
:^vkwlpssgpvtgyrvttt\pkngpg\ptktktagpdotemti 
f. 2 loft ve yws vyaonpsgesqplvotavtn j d rpkglaftdv 
DVDS 2 kiawespogqvsryrvtysspedgi helfpapdgeedta 

ELCGLKPGSEYTVSVVALHDDMESQPLJ GTOSTAI PAPTDLXFT 
OVTPTSLSAUKTPPNVCLTGYRVRVTPKEKTGPMKEINLAPDSS 
S\ WSGLMVATKYEVS VYALKDTLTSR PAOGWTl'LENVS PPRR 
Ak VTDATET7 1 T3 SWRTXTETITGF0VDAVPANG0TP1QRTI XP 
DVR SYTI TGLOPGTDY K 1 YLYTLNCNAK SSFWI DA5TA 3 DAP S 
N'/KPLATTPttSLLVSKOFPRARXTGYJ 1 KYEKPGSPPREWPKP 
KPGVTEATITGLEPGTEYTIYVIALKTJNQKSEFL1GRKKTDELP 
CLATLPHPNliJ-IGPEILDVPSTVQKTPFVTHPGYDTGNGIOLPGT 
5. t OOPS VGOy K 3 FE EH G FR RTTP PTTAT PIRHKPRPYP PNVG QE 

AI SQTTjSWAFFODTSEYJ isotpvgtdkeplofrvpgtstsat 
ltgltrgatyn 1 1 vbalkvqqrhkvr ee wtvgns vneglnqpt 
discfdpytvekyavgdekermsesgfkllcoclgfgsghfrcd 
s s .rwchdngvnyki gekwdrc?gbngokmsctclgngkgefkcdp 

HKATCYDDGKrYHVGEOWOKEYLGAlCSCTCFGGORGNRCDNCR 
R :-GGEPSPEGTTGOSW0YS0RYH0RTHTNVNCPl ECFMPLDVQ 
ALREDSRE 


5367 


235 


3593 


k k 3 lmmlc k ks4 i v 1 e y lad i ly ey lyg f cfsg1 k k y li i hv lrl 
ilelkmtrllleksvslotqylllivk:i>swffgkemrhhlqim 
evwmbkqds / ri vgngs eqqlqkeladvlmdppmdpqpgekelv 
kksoldgegdgplsnolsasstinpvpl^vglokpemslipvkpgo 
cvs, easspftpvadedswfsklitylgcasvnaprsevealrnh 
S] lrsocoislcvtlsvfnvsegivrlldpotnteiakypiyki 

h? CVRGHDGTPESDCFAFTESHYWVELFR nrVFRCEIQEAVSRI 
LY S FATAFR R S AXQTF LS ATAAPQTPDS D I FTFS VSLE I KEDDG 

kgvfsavpkdkerocfklrogidkkiviyvoottnkelaiercf 

CLl LS PGKDVRNSDmhLDhESMGKSSVG KSYVITGS WNPKS PH 
FCVVNEETPKDKVIiFWTTAVDLVITEVOEPVRFliLETKVRVCSP 
K I F. L.FWPFSKR STTEH FFLKLKQI K0R F- R KNNTDTLr Y E WCLBS 
E5.7.RERRKTTASPSVRLP0SGS0SSVIPSPPSDDEEEDMDEPLL 
Sr-cGDVSKECA.EKILETWGEIil^SKWHT^U^/RPKOLSSLVRNGV 
FZALRGEVVIQLLAGCWWDHLVEKYRIbl TKESPQDSAI TRDIN 








RT F ? AH D V F K DTGGDGQDS LY K 3 C KAY S V Y DEE 1 G Y COGQSFLA 
AVLLLHKPEE0AFSVLVK1KFDYGLRELFKQNFEDLJ1CKFYOLE 
KLMOEYJPDLYNHFLDISLEAHKyASOKFLTLFTAKFPLYMVFH 
J J DLLLCEGI SVI FNVALGLLKTSKDOLU .TDFEGALK FFRVQL 
PKRYRSEENAKKLMELACNMK1SOKKI.KKYEKEYHTMKE00AQO 
EDPlERFERENRRLOEANMRL>EQENDDLAHEliVT£K 3 ALRKDLD 
NAEEKACALNK^LLMTKQKLIDAEEEKRRLEEESAin.KWlCRRE 
LDKAESEIKKNSSI3GDYKQICSQLSERLEKQQTANKVE1EKIR 
QKVDDCERCREr FNKEGRVKG I SSTKEVLDEDTDEE KETLKNQL 
REMELEIAQTKL\0LVEA5:CK2QD\I ) EHPF*GLPFKE\VQAA\K 
KTWFNRTLSSlXTATGVQGKETC 


5368 


57T- 


70 14 


U M/vLJ tr nivKj s> LA^VjXv 1 VlLM^f Mr aA X r /\ V I r ij^rt-Uv V ill tr no 

ROAAG 3 PG 1 TPTEEKDGN LPD 1 VNSG S LH E FkVNLl I E R Y G P WS 
FWFGRRLWSLGTVDVLKOHIKPNKTLD/LF'NHAEVilKVSlW 
WWOCE*KP\ORKKLYENGVrDSLKSNFALLLXLPEELLDKWr>SY 

fetqhVvplsqhmlgfamks vtqmvmgstfeddqev 1 R eoknhg 

TWSE1GKGFLDGSLDKNMTRKK0YEDALMQLESV1,KNI I KERK 
ORNFS0H1FIDSLVQGNLNDQC3 LEDSM1 FSLASCI 3 TAKLCTW 
/\3WFI>TTSEEVgKKLYEEIN0VFGNGPVTPEKIEQLKyC0HVLC 
ETVRTAKXTPVSAOLODI EGK3 DRFI I PRETLVLYALG WLQDP 
NTVJPSPHKFDPDRFDDELVMKTFSSUGFSGTOECPELRFAYMVT 
TVLI,CVLVKRLHLLSVEGQVIETKYELVT$SREEAW3TVSKRY 


5369 


1 


662;: 


PR S LC KS LWAEAAVLADGGLRR RRRLLKGTMS AS FV F N GAS LED 
CHCNLFCLADLTGIKWKKYVWQGPTSAP1LFPVTEEDP3 LSSFS 
RCLKADV1jC?/VWRRDQRPERRE\L* 1 FWGGEDP\ VliLTJ.FTMTY 
OKKKJ / iECX;RMDFPfWAVljC?SK7\VHNIJ..ERCLriNRNF\, , R3GKWF 

vkpyekdekpikksehlscsftffl»gdsnvctsve:nchopvy 
llseehitlaqosnsffovilcpfglngtltgqafkmf dsatkk 
ijgewxofypiscclkemseekc-edmdwi-ddslaavevxvagvr 
m3ypacfvlvp0sdiptpspvgsthcsssclgvhqvpastrdpa 
mssvtltpptspeevqtvdpqsvqkwvkfssvsdgfnedstshh 
o?k3prklianhvvdrvwqecnmnra0nkrkysassggi*ceeata 
akvaswdfveatortncs clrhknlksrnag0ogqaps lgooqq 
ilpkrktnfkoeksekpokrpltpfhkrv£vsddvgmd\ads\a 
sorlv\ i sap\ds0\vrfsnir\tndvak\tp0mhgtemanspq 
ppplsp\kpcdwdegvtktpstpqsqkfyqmpt?dplvpsxpm 
eur id^lsqsfppqyqeaveptvyvgtavnleedeak 3 awkyyk 
ffkkkdveflppolpsdkfkddpvgpfgoesvtsvtelmvqcxk 
pikvspelvqqyqz knqclsaiasdaeoepk3 dpyaf"\/egdeef 
lffdkkdronsereagkkhkvedgtssvtvlsheedajtslfsps 
ikodaprptskarppstsliydsdlavsytdldnbfnsdedelt 
pgskrsangsddxasckrsktgktldplscistadlhk.myptpps 
leqk3 kgfspmnmnnkeygsmdttpggtvlegnsss1gaqfki e 
vdegfcs pkpse3 kdfsyvykpencqi lvgcsmfaplktlpsqy 

bPL3 KbPEECl YROSWTVGKliELLSSGPSWPPI XEGDGSKMDQE 
YGTAYTPQTHTSCGMPPSSAPPSNSGAGILPSPSTPRFPTPRTP 
RTPRTFRGAGGPASAQGSVRYENSDLYSPASTPSTCRFbNSVEP 
ATVFS 3 PEAHSLYVNLILSESVMNLFFO)CNSDSCCICvCNMNI K 
GADVGV Y I PDPTQEAOYRCTCGFSAVWNRKFGNNSGLFFEDELD 
IJGRNTDCGKEAEKRFEALRATSAEHVNGGLKESEKL^DDLILL 
L0WTNI,FSrFGAAJ)QDPFPKSGVlSNVr^\n2ERIX:rNDCyiA 
LEHGRQFKDNMS GG KVDEALVKS SCLH P WS KJINDVSMCCSQDI L 
RMLLSLOPVLQDAI QKKRTVR PWGVC^PLTVJQQFHXMAGRGSYG 
TDESPEPLPJPTFLLGYDYDYIiVLSPFALPYWERIjMLEPYGSOR 
DI AYWLCPENEALLNGAX SF FRDLTA I Y ESCR LGQH R PVSRLb 
TDS I MR VGSTAS KKLSEKLVAE WFSOAADSNNEAFSKLKLYAOV 
CRYDLGPYTASLPLDSSLLSQPNLVAPTSOSLJTPPOM'I'NTGNA 
NTPSATLASAASSTMTVTSGVAISTSVATANSTIiTTASTSSSSS 
SNLNSGVSSNKLPSFPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 
OTSAU?TAGISGESSSLPTOPHPDVSESTMDRDKVGIPTTX?0$H 
AVTYPPAIVVyi3DPFTYEWTDESTNSSSVWTIjGl>l,RCFLEMVO 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TLPPHI KSTVSVQI I PCQYLLQPVKHEDRE J YPQHLKSLAFSAF 
TOCKRPLPTSTNVKTLTGFGPGLAPJETALRS PDRPECI RLYAPP 
\ Fl LA P VKD KQTELGET FG EAGQK YNVLFVG YCLSHCQK Wl LAS C 
7DLYGELLETCIINIDVPNRARRXKS5ARKFGL0KLWEWCLGLV 
OMS55LPWRW1GRLGR3GHGELKDWSCLLSRRNLQSLSKRLKCM 
CRMCG2SAADSPSILSACLVAMEPOGSFVIMPDSVSTGSVFGRS 
TTLNMQTSQLNTPQDTS CTHI L V F PTS AS VQVAS AT YTTENLDL 

a fn pnndg adgmg i fdlldtgdd ld f d 1 1 n i lpasptgs p vhs p 
gshyphggdagxgostdrllstepheevpnilqoplalgyfvst 
akagplpdvsfwsacpoaqyocplfbkaslklhvpsvosdellhs 
khshpldsnotsdvlrfvleoykalswltcdpatodrrsclpih 

FWLNQl .YNFI MNML 


5370 






RWSRKLFLRRAAOATESRPPQSOEMHPPTGKEVilALKRLRDSAN 

akdvetvoolledgadpcaaddkgr talhfascngndo I VQLLL 

PHGADPNORDGU5NTPLHLAACTMHVPVITTLLRGGARVDALDR 
AGRTPLHLAKSKLN3 LQEGHAQCLKAVR /HGGEADHPYAEGVSG 
APRAT*AARCSGVFPSPSRWIjGSAPWSRSSCTlWSLPLHEAKCR 

2ivt?DT.cciiiirw~ , c:&Dccc;Qf~r"r\/cT*c , T m arc? Qi.KT?ii^ r rcT D\rr~ 
/*.V z<yuoz>H/\\J\3jnr uoOd V-L J »o Jo li>\Jj>icoJj£?ljr K/i^, 1 jurvL* 

GC3SWL 


5371 


1 3 ii 


li'. 


j/V^MLWkLLLRSOSCRLCSFRKMRSPPKYRPFLACFTYTTDKCS 
FKENTRTVFKLYKCSVDI RKIRR\- KDGYF* RMKPKLKKLRI/F 
LOELGADF.TAVASI LERCPEAl VCS PTAVNTQRKLWQLVC:<NEE 
E L 3 K I > 1 EOF P ES FFTI KDQENQKLN VQF PQELGLKNW ISRLLT 
AAPNVFHNPVEKNKQMVR I LQES YLDVGGSEANMKVWLLKLLSQ 
KPr I uLNSPl A1KEH I»Er IiQEQGr i or £ HAJLlvbKJjKOr Lrt>I_C 
PRSI0NS3SFSKNAFKCTDHDLK0LVLKCPALLYYSVPVLEERM 
OGLLREGIS 1 AO I RETPhP/LELTPQIVQYRI RKLNSSGYRI KEG 
KIJ\N LNGSK KEFEAN FGK I C/AKKVR PLFNPVAPLNVEE 


5372 


53 


65' 


SPGAOFLWAAPDMPDPLFSAVOGKDEILHKALCFCPWLGKGGME 
FLRLLILLFVTELSGAHNTTVFOGVAGQSLQVSCPYDSMKHWGR 
R KAWCRQLGEKG PC0RWSTHNLWLLSFLRRWNGSTA3 TDDTL-G 
GTLTITIJ^NLOPHDAGLYQCOSLHGSEADTLRKVLVEVIJVDPLD 
HKl)AGDLWFPG\DLRASRM?MW5TAS?GASVfKEK?;PSHPL?SFS 
9 W P AS FS SR F * OP AP SGLQPGMDRSOGH I H PVNWTVAMTOG I SS 
KLCOG 


5373 


2834 


34< 


VKKTKSIFNSAMOEMEVYVENIKRKFGVFNYSPFRTPYTPNSOY 
CMXLDPTNPSAGTAK3 DKOEKVKliNFDMTASPKI LMSKPVLSGG 
7GRRISLSDMPRSPMSTNSSVHTGSDVE0DAEKKATSSHFSASE 
ES M DFLDKS TAS P AST KTGOAGS LSG S PKPFSPQLSAP I TTKTD 
KTS'TrGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLJSPKRO 
IRS R FQLNL DKT I ES CKAQLG I NE I S EDVY TAVEHSD S EDS E KS 
PSSDSEYISDDEQKS*GTSOEDTEDKEGC0MDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
FKP I KDKI/KGKDETDS PTVHLGLDSDSE\NELiVI DLGEDHSGRE 
GRKNKKEPKEPSPKODWGKTPPSTTVGSKSPPETPVLTRSSAO 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\TAP 
AVORSCGTSSTV0QKE1T0SPSTSTITLVTST0SSPLVTSSGSM 
STLVS S VNGDLP I GTASADVAAD1 AKYTSKL\ MDAIKGTM\TEI 
YNDLS KN\TTWKAQLAEDSQGLR1 EI EKXQWLHQQElASEMKHN 
LELTMAEMRQSWEOERDRLIAEVKKOLELEKQOAVDETKXKOWC 
AKFKKEAIFYCCwm , SYCDYPCQ\OAHWPEK\MKSCTOSArAPQ 

\oeadae\vntetlnkssogsssstosapsetasa\skeketsa 
ekskesgstldlsgsretpssillgsnogsdhsr\skkssv?sss 
dekrgs\trsdhn/tpstohgrsllpgkesragtpfu;tsk 


5374 


2814 


34f 


VKKTKS I FN S AMQ EM£ VY V EN I R R K FG V FN YS P FRTPYTPNS Q Y 
OWLLOPTNFSAGTAK I DKQEK VKLNFDMTAS PKI LMSKFVLS GG 
TGRJ? I SLSDMPRS PMSTNSSVHTGSDVEOI3AEKXATSSHFS ASE 
ESMDFLDK$TASPASTKTGQA6SLSGSPKPFSPQI»SAP3TTKTD 
KTSTTGS I LNLNLORSKAEMDLKELS ES VQQQSTP VPLI S PKRQ 
J R£ R FQJMbVKT I ESCKAQ LG I NE I S E0VYTA VEHSDS EDS E KS 
DS SDSBY1 SDDEQKS +GT$QEErTEDKEGCQMDK£PSAVKKKPKP 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)






TNPVE I KEKLKSTSFAS EKADPGAViOKAS PEPSKDrSGKAKPS 
PHPI XDXbXGXDSTPS PTVHLGbDSDSE\NEbVl DbGEDHSGRE 
GRKWKKEFKEPSPKODWGKTPPSTTVGSHSPPBTPVLTRSSAO 

tsaagatattstsstvtvtapapaatgspvkkorpllpke\ta.p 
avqrscgtsstvcx?keit0spststitlvtstossplvtssgsm 
stl^ssvngdlpigtasadv^aadiakytsklxmdaikgtmvtel 
yndls kn\tt w xao/laedscgbr 1 e i e k l>qwbhqqel\s emkhn 
beltmaemkqsweoerdrllaevxkqlebekqoavdetkkkqwc 
anfkkeaifyccwm7sycdypc0\0ahwpbjamksct0satap0 
\0eadae\vw?etbnkssogsssstosap5etasa\skeketsa 

EKSKESGSTLDLSGSRETPSSIbbGSNC<3SDHSR\SNKSSWSS£ 
DEXRGS\TRSDKN/TPSTCHGRSLLPGKESRAGTPFLGTSX 


5375 2907





H3F1ASEEPMLERKCRGPI,AMGPA0P»LLSGPS0ESPQTLGKES 
RGbRQQGTSVA\OSGAOAPGRAHRCAHCRRHFPGWVA\bWLHTP 
KCOA/RGLPLPCPECGRRFRflAPFLAbHPQVHAAATPDWGFACH 
LCG OS FRGWVAbVLHLRAHSAAXAGP ?ACP KMAR DAFWR RKAAS 
S£l LRRCHPSRPRGPRPFICGNCGRSI bPTWDQ/bKVAHKRVHV 
SRRP*ERGPPAXVFWGPRPRGPPTGDTPPGPGGDAVDRPF\OCA 
CCGKRFRHK\PNLIRS}£AACrSGERPHO/CSRECG\KRFTNKPY j 
LTS\HRRITHTAROPYPCKECGRRFRHKPNLLSHSKIHKRSEGS 
A0AAPGPGSPQLPAGPOE?AAEPTPAVPLKPAQEPPPGAPPEHP ' 
0DP1FAPPSLYSCDDCGRSFRLERFLRAH0RQHTGERPFTCAEC 
GKNFGKKTHLVAHSRVHSGERPFRLARKCGRRFLPRASOSGGRN 
SAEPNAPRFCPFVCPDCGKAFPHXPYbAAHRPIATPAEXPYVCP \ 
DCKKAFSQXSNb\VSKRR3HTGERPYACPDCDRSFSQKSNLITl! . 
RKSHI RDGAFCCAI CGQTFDDEERLbAHQKKKDV 


5376 


4504 


5 9: 


VSTFSLCLWPAGGGGRGRVSMMAOSKRl-iVYSRTPSGSKMSAEAS 
ARPLRVGSRVEViGKGHRG'lVAYVGATbFATGKWVGVILDEAKG 
KNDGTVQGRX Y FTCDEGHG I FVRQSQ 1 QV F EDGADTTS PET PDS 
SASXVLXREGTDTTAKTSKLRGLKPKXAPTARKTTTRRPKPTRP 
ASTGVAGASSSLGPSGSASAGELSSSEPSTPAQTPLAAP11PTP 
VLTSPGAVPPLPSPSXEEEGLRAQVRDbEEKLETLRbKRAEDKA 
KbKELEKHKIQLEQVCEHKSX^IOEOOADbORRbKEARKEAKEAb 
EAXER YMEEMADTADA I EMATbDXEMAEER AES LOQEVEAbXER 
VDELTTDbElLKAEIEEKGSDGAASSYObKQLEEONAKLKDAl.V 
RMRDLSSSEKOEHVK\L0RbMEXXWOELEVVRQ0RERLOEELSO 
AESTIDELXE0VDAA1.GAEEMVEMLTDRNLNLEEKVRELRETVG 
DLEAMNEMNDEbQENARETEbELRBQbDMAGARVREAQXRVEAA 
OETVADYCXJTlKXYROLTAHbODVMRELTNQOEASVERQQQPPP 
ETFDFKI KFAETKAHAKA 3 EMELRQMEVAQAKRHMSLLTAFMPD 
SFLRPGGDHDCVLVblbMPRLI CXAEL I RKQAQEKFEbS ENCSE 
R PG LRG AAGEOLS F AA3 G b VY \ S bM P AAGHR YHR Y * CHAL SOCR 
LDWYXXVGSbYPEMSAHERSLDFLI ELLHKDQLDETVNVEPLT 
KAI KY YOHLYS 1 HUVEOPEDCTMQLAEHI KFTQSAbDCMSVEVG 
RLRAFbQGGQEATDIAbLLRDLETSCSVDIRQFCKXIRRHMPGT 
DAPG 2 PAALAPGPQVSDTbbDCRXlibTWVVAVLOEVAAAAAQbl 
APLAENEGbLVAALEELAFKASEOIYGTPSSSPYECLROSCNIb 
I STMNX\bVTAMOEGEYDAERPPSKPPP\\rEbRAAAbRAE 3 TDA 
EGLGbKbEDRETVI KEbXXSbXl XGEEbSEANVRbTbbEKXbDS 
AAXDADER I E KVQTR LE ETQAbbRKXE XEFEETMDAbQAD I D0L 
EAEKAELKORLMSQSKRTIEGLRGPPPSGIATLVSGIAGEEQQR 
GAI PGQAPG5 VPG PGLVXDS P bbbQQ I SAMRbH 3 SQbQHENS 3 b 
XGAOMKASLASbPPbHVAKLSHEGPGSEbPAGALYRXTSQbbET 
bNQbSTHTHWDlTRTSPAAXSPSAObMEQVAObXSbSDTVEKb 
KDFA^KETVSORPGATVPTDFATFPSSAF^RAKEEOODDTVYMG 
KVTFSCAAGFGORHRLVLTQEObHOLHSRbl S 


5377 


7S2 


110£ 


DVPCKRVbPAEAQEKGOLTbSCGESGEEGXF* YHEVRQAEGES * 
/WFGPNVRbVHTQLKTKXPSGTbXAXFYLHTGSTKFAARlSCTX 
SS ♦ KPGYDGWKGGQY 1 F3 FRGMRWEEQP 


5378


2009 


664 


OASGTTbRPbFDbPQLKJlREATSRNRAbKJ^RGRbVljMTSCbPAb 
RF3 ATPRbSAMPHI DN 0V KbD F KD VbbR P XRSTLKS R S E VDbTR 
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SFSFRNSXQTYSGVPI 1 AANMDTVGTFHMAKVLX:KS*VPGSFWD , 
V p QMGCV FL I Y KL FT L K KM LL L S V LL P A $ 3 L V AEK FS L F TAW 
KiiYSLVQWOEr AGONPDCLEHLAASSGTaSSDFEOLEQI LEAJ T 
OVKYICLDVANGV<:r;HF\rEFVKDVRKKFPC>HriMAGNVVrGEMV i 
EEL I LSGAJDI I KVGIGPGSVCTTRKKTGVGYPQLSAX'MECADAA = 
HGLKGH1 1SDGGC?CPGDVAKAFGAGADFVMLGGMLAGKSESGG i 
ELI ERDGKKYKLF YGMSS* I \AM\XKYAGGVAEYRASEGKTVEV j 
PFKGDVtHTIRDI^CGIKSTCTYVGAAKLKELSRKTTFJRVTOO 
VNP1FSEAC 


5379


2009 


G64 


OASGtTLPPLPDlPOLKRREATSRNRALKPRGRLVLMTSCLPAL j 
RFIATPRLSAMFHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTE ] 
SFSFRNSKQTYSGVPI lAANMDTVGTFEMAKVLCKS* VPGSFWP i 
VPOMGCVFL I YKL FTLKWKKLLLS VLLPAS I LVAEKFSLr TAVH ' 
KilYSLVOWOEFAGQNPDCbEHlAASSGTGSSDFEQIiEOILEAJ V 
Q V K Y I CL I) VANG Y S EH F V F> F V K I J VR K R F POKT I MAG NWTGEMV 
EELI LSGADI 1 KVGIGPGSVCTTRKKTGVGYPObSAVMECADAA ! 
HGLKGH3 3 SDGGCS CPGDVAKAFGAGADFVMLGGMLAGKSESGG , 
EL I ERDG K K Y KLFYGMSS * I \ AM\ KK YAGG VAE YRAS EGKTVE V 
PFKGDVEHTIKDJLGGIRSTCTYVGA^KLKELSRRTTFIKVTOO 
VNPIFSEAC 


5380 


2 


2050 


PSRAGGAERGR.AAAARSPGGSAAGWECPSVLDEAGACTMSSCVS j 
SOPSSNRAAPOOELGGRGSSSSESgKPCEALRGLSSLSIHr-GME 
SF1WTECEPGCAVDLGIARURFLEADGQEVPLDTSGS0ARPHL 
SGRKLSLOERSC^IAAGGSLDMNGRCICPSLPYSPVSSPOSSP 1 
RLPRRPTVESHHVS1TGMODC\'OLNQYTLKDEIGKGSYGWKLA 
YN DNT Y Y AM KV 1 5 K X KL 3 RQAAF PRR PPPRGTR PA PGG C 1 0 P 
RG P I \ EOVYQE 1 A\ 1 LKKLXttiPN W\KLVEVL\DDPNF.DHLYMV i 

F\ elvncc pvmevttlkplsedoarfyfodl: KGIE ylhyoki 3 

H\RDIKPSNLLVGKDGHIKIADFGVSNEFKGSDALL5NTVGTPA 
FMAPESLSE7RX3 FSGKALDVWAMGVTLYCFVFG*CFFMPERIM ! 
CLHSXIKS0A1.EFPD0PDIAEDLKJDLITRM1.DKNPESRIWPE3 ; 
KLHPWVTRHGAi?PLPSEDSNCTLVEVTEEEVENSVKHl PSLATV j 
ILVKTMJRKRSFGNPFEGSRRBERSLSAPGNLLTKKPTRECESL 
SELKT* KXSPLPACCKVT* EFPKPSGCRPSCWQPPFLHTHSQPR 

♦fepprtdealcpyetgrtcwapllq^-wwvgtplpfplstswl 

PDLVGAPGSHFCFLNIALLRYNSKTM 


5381


? 


2050 


FSRAGGAE KGR AAA AR S PGSSAAGWECPS VLDEAGACTMS S CVS 
SQPSSNRAAPQDE1CGRGSSSSESQKPCEALRGLSSLS3HLGME 

SF1 vvtece pgca vdlglardr pleadgqevpldtsgsoar phl 
sgrklsloersogglaaggsldnngrcicpslpyspvsspossf 

RLPRRPTVESHHVSITGMQDCVOLNQYTLKDEIGKGSYGWKJLA 
YNENDNTYYAMKVLSKKKL1RQAAFPRRPPPRGTRPAPGGCIQF 
RGPI \ EQVYOE I A\ 3 LKKLDHPNW\KliVEVLVDDPNEDHLYMV 
F\ELVNOGPVMEVPTLXPLSEDOARFYF0DL3KGIEYLHYQKII ; 

hnrd: KPSNLLVGEPGHI xi adfgvsnefkgsdallsntvgtpa j 

FKAPESLSETRK I FSGXALDVWAMGVTLYCFVFG* CPFMDER1K 
CLHSX I KSOALE FPDQPD I AEDLXDL I TRMLDKNPESR I WPEI i 
KLHPWVTRHGAEPLPSEDENCTLVEVTEEEVENSVKH3 PSLATV ; 
3LVKTMIRKRSFGNFFEGSRREERSLSAPGNLLTKKPTRECESL : 
SELKT* Kl S PLPACCKVT* EFPHPSGCRPSCWQPPFLH7HSQPK j 
* PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL 1 
PDLVGAPGSHFCr LNIALLRYNSHTM - 


5382 


1536 


203 


GARGSOODAPALCEABVRGPERA0PARGRMTKARLFRLWLVJL.GS ] 
VFMILLIIvyWDSAGAAHFYLHTSFSRPHTGPPLPTPGPDRDRE : 
LTADSDVDEFLDXFLSAGVXOSDLPRKETE0PPAPGSKBESVRG 
YDWSPRDARRSPDQGROOAERRSVLRGFCANSSLAFPTXERPFD 
DI PNSELSHLI VDDKHGAI YCYVPKVACTNWKRVMI VLSGSLLH 
RGAPY RDPLR I PKEHVHNA^AHLTFTaKFWRRYGKLSRWLMKVXX 
KKYTKFLFVRDP FVRLI S AFRSKFELENEEF/ ♦ PQVRRAKAAAV 
RCPHOPARLGARGLPRWPOWSFANF IQYLLDPHT2KLAP FNEK 
WRQVY RLiCHPCQ I D7DFVGXLETLDEDAA0LLQLLOVDLAAPLP 
Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)






FELPGTGPPSSWEEDWFAK3 PLAWRQOLY KLYEADFVLFGYPKP " 
ENLLRD 


5383


45 


525C 


VERLbGCRNS KRTKRMLI S KNMPWHRXOG I SrGMY SAEELKKLS j 
VKS1TNFRYLDSLGNPSANGLYDLALGPADSKEVCSTCV0DFSW 
CSGHLGIIIEbPLTVYNPLLFDKLYLLLRGSCLNCHKLTCPRAVJ 
HbbLCQLRVbEVGALQAVYELERILSRFLEENADPSASEIREEL 
EQ : TTE I VONKLLG S OGAHV KNV CES KSKL1 ALFWKAKMN AKR C 
PHCKTGRSVV^KEIINSKLTITFPAI^VHRTAGQKDSEPLGI SEAQ 
KJKKGx-LTPTSAREHLSAUnJKMEGFFLNYLFSGMDDDgMESRFN 
PSVFFLDFLWPPSRSRPVSRLGDOMFTNGOTVNLC'AVMKDWL 
I RKLLALMAOE0KLPEEVATPTTDEEKDSL I AIDRS PLSTLPGO 
SL3DKLYNIWIRLQSHVN1VFDSEMDKLMMDKYPGIRQILEKKE j 
GLFRKHMKGKRVDYAARSV1 CPDMY1NTNE1GI PMVFATXLTYP 
OP^PWNVOELROAVING.^WHPGASMVINEDGSRTALSAVDMT 
CREAVAKOLLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPTLH 1 
RPSIOAHRAR: LPEEKVLRLHYANCKAYNADFDGDEKNAHFPQS j 
ELGRAEAYVU^CTDQQYLVPKDGQPLAGLIODHMVSGASMTTRG 
CFFTR EHYMELVYRGLTDKVGRVXLLSPS 2 LXPFPLWTGKOWS 

tll1ni i peuh1plnlpgkak1tgkawvketprsvpgfnpdsmc 
esqvi i regeblggvldkahygssayglvkccyei yggetsgkv 
ltclar lfta ylql yrg ftlgv ed 1 lvkp kadvkrqr 1 1eesth 
cgpqavraalnlpeaasydivrgxwqdahlgxdqrdfnm] di>kf 
keevnhysneinkacmpfgi.hrqfpent1.qiwvosgakgstvnt ■ 
mqjs(_"iit>g01 elegrstp1,masgkslpcfepyeftpraggfvtg 
rfltg] kp p e f ffh ckag r eglvdtav ktsrsg y lqr c 1 1 khle 
glwoydltvrdsdgsw0flygedgldipktqflqpk0fpfla 
snyevimksohlhevlsradpkkalhhfra1kkwoskjifntl1.r 
rgafe5;ydokioeavkalklesenrngr/rpwds/g/rmlrmwy 
eldeesrrkyqkkaaacpdpslsvwrpdiyfasvsetfetkvdd 
y sqe waaqt e x s y e xs els ldrlrtllql \ kwq r s bce pg e a vg 
llaaos 1 geps70mtl^tfhfagrgemmvtl<gi prl3e3lmvas 

ANIKTPMMS V P VLNTKKALXRVKSbXKOLTRVCLGEVLQKl DVQ 

esfcmeekonkfqvydlrfoflpkayyooekclrpedilrfmet 
rffkl.lmes 1 kxk)wkasafp^vntrratordldnagebgrsrg 
eoegdeeeeghlvdaeaeegdadasdakrkekqeeevdyeseee 
eeregeenddedmc?eernphregarktoeqdeevgl/gk*ggpv 
psrppdaapethp0pgapga\eamerrvqavre1hpf1ddy0yd 
tees l w c q vtv kl plm k i n fdmss lws lahgav i yat kg i tr c 
llnettiwnknekelvlntegjnlpelfkyaevldlfrlysndih 
a i antyg 1 iaalrv 1 ske1 kdvfavygi avdprhlslvadymcf 
egvykflnr fg 3 rsnssploomtfets fqflkoatklgshdelr 
s psaclwgkwrggtglfelkqplr 


5384


196 


886 


OSCGORLPTVL*L*GP?GSCPClbSLF\PGRPHALPEIRPYlNI 
TILKGDKGDPGPMGLPGYMGREGPOGEPGPQGSKGDKGEMGSPG 
APCOKRFFAFSVGRKTALF.SGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPLRGI YFFSLNVHS WWYXETYVHIMHNQKEAVI LYAQPS 
ERS1MQS0SVWLDLAYGDRVVA/RLFKRQRENAIYSNDFDTYITF 
SGHL1 XAEDD 


5385 


326 


799 


LMVPRTKXEAJ'APPKAEAJCAKAliNKAKKAVLKDVKSHKKNKlHNl 
SPTFRRP XTL* LR RQPKY PWXSTPRRNKLDHHVI 1 KF PLTTE * A 
VXKI EKNSLbVFTVPVXANKHOI XQAVXX/LCDIDVA K VWTLJQ 
SDGERXAY VR LAPDYDALWATKIGI T 


5386 


326 


799 

- 


bWPRTKX£APAPPKAFJVKAKAb\KAXKAVLKDVHSHXK3>JKIHK« 
SPTFRR P XTL * LRRQPKYPWKSTPRRNKLDHHVI I KFPLTTE* A 
VXXI E7^SLLVFTVDVKANXK0IX0AVKJ(/LCDIDVAXVNTLJ0 
SDGERXAYVRLAFDYDALWATXIGIT 


5387 


2 

- 


2117 


FWA^GGCW FVLGER RAGSLLS AS YGTFAMPGMVLFGRR WAJ A 
SDDLVFPGFFELWRVLWWI GI LTLYLMKRGKLDCAGGAbLSSY 
lilVLMlUAWlCTVSAlMCVSMRGTICNPGPRKSMSKLbYIRL 
ALFFPEM VWASI^AA WADGVQCDRTWNGI I ATVWSW 1 1 1 AA 
TVVSI 1 1 VFPPLGGKMAPYSSAGPSHLDSHDSSObLNGLKrAAT 
Predicted
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SVWET^IKLLCCCIGKi>DHTRVAFSSTAELFS?YFSnfDLVPSD 
3AAGLALLH0Q0DNIRKN02PA0WCHAPGSS0EADLDAELKNC 
HHYMOF.^AfAYGWPLYlYRKPLTCljCRIGGDCCRSKNPOTMT/K 
VGGDOLQL/CTSAPILHTHRAAVOGLHPRCLPWTSFTELPFLVA 
LDHRK ES WVAVRGTMS LQVVhTDLSAES E VLDVECEVQDR LAH 
KG 1 SOAAR YVYQRLINDGI LSCAFS I APEYRLV J V5HS LGGGAA 
ALLATMVRAAY PQVRC YAPS P PRC LWSKALQE Y SOS F 1 VS L.VLG 
KDV3 PRI .SVTNLEDLKRRI LR WAHCNKPK YK I LLHGLWYELFG 
GNFNNl>PTELDGGDOEVLTOPLLGEOSLI,THV?SPAYSFSSDSPL 
DSSPKYFPLYPPGRIIHLOEEGASGRFGCCSAAHYSAKWSHEAE 
FSXILlGPKJibTDHMPDIiiMRALDSWSDRAACVSCPAOGVSSV 
DVA 


5388 


35fc<r 


753 


TATXJG AGGGGRROAGVRRH Y LY P FTGG YR K RKAACQAER PAARS 
XDTDIJVAYOKGNLGVOLRNMAOETNHSOVPMLCSTGC^FYGNPR 
TNGMC S V CY KEHLQRQNS SNGR ISP PVQCTDGS V ?E AC S ALDST 
SSSMOPSPVSNQSLLSESVASSOI'DsrSVDKAVPETEEVQASVS 
DTAQOPSEEOSKSliE\NRNKKRIAVSCAGRKWDbLGLNAGVEMF 
TWYTVrQMYTIALTITKOMLKNFVFOOEFKSFGSFHCOLLEYK 
1LEHL0TKN 


5389 


15< y- 


753 


ta£)ggagg<;grroagvrrhylypftggyrkrraa?oaerp/u\rs 
kdtdlaayokgnlgvclrnmaoetmisovpmlcstgcgfygnpr 
tngmcs vc ykehlorqnssngr i sr p vocrrxic VP EAOSALDS T 

SSEMOPSPVSNQSLLSESVASSO^BSTSVDKAVPETEDVOASVS 
DTA00 PS EEC S KSbE\NRNK XR I AVSCAGR KWDbLGbNAGVEMF 

ILEHLQTKN 


5390


21'/ 


1332 


EDPRK1 >MEDKMWS ECEG P EM S LVCLTD r'QAHAR EO LS KSTRDFI 
EGGADDSITRDDNIAAFKR1RLRPRYLRDVSEVDTRTTI0GF.EI 
SAP I CI APTGFHCLVWPDCEMSTARAAQAA\G J CY I TSTFASCS 
LED I V I AAPEG LRW FQL.Y VH PDLQbN KQL1 OR VESLG FKAbV I T 
LDTPVCGNRRHOIRNQLRRNLTI.TDLQSPKKGNAXPYFQMTPIS 
TSLCVWDLSWFOS I TRIiP 3 1 LKG1LTKEDAELAVKH WVCX31 1 VS 
NHGGROl'DEVLASIDAbTEWAAVKGKIEVYLDGGVRTGKDVLK 
ALALGAXC1 FLGDAILWAbASKGEHGVKEVLtn bTNEFHTSMA\ 
LTCCRS VAE I NRNLVQFSRL 


5391 




1292 


VKKA/iGRSRGPPTAGGQRCEEAPGTVMERRLGVRAWVKENRGSF 
OPPVCNKLMHOE0LKVMFVGGPNTRKDYHIEEGEEVFYOLEGDM 
VLRVLEOGKHRDWIRQGE1 FbbFARVPHSPORFANTVGLWER 
RRLETELDGLi^YYVGDTMDVLFEKWFYCKDLGTOLAPlIOEFFS 
SEOYRTGKPIPDObLKEPPFPbSTRSJMEPMSLDAWLDSHHREI* 
QAGTPLSbFGDTYETOVIAYGOGSSEGbRONVDWILWOLEGSSV 
VTMGGRULSLGPWMDSLLVl>SWGPSY\AW\ERTOGSVAl>SVT\Q 
DPACKKSPWGEPSCHGLKAATGVPSTbEVPSLPNNSPSPHYLSV 
YCRCVPHRPAHCCHPPSCPSOPRCHAPGRAAAPHbLWOTQPTAL 
PVLP0GLPPAPLLP1 PLSLQTQCSTSTPRRPS3 KAS 


5392 




1623 


IRGSNAOKWGASGSGGACPQPDPAGPGGVPAIAAAVLGACEPR 
CAAPCPLPALSRCRGAGSRGSRGGRGAAGSGDAAAAAEK1RKGS 
FIHKPAHGWLHPDARVLGPGVS Y WRYMGCI EVLRSMRSLDFOT 
R TQVTR EA INRLHEAVPGVRGS WKXKA PNKALAS VLGKSNL RFA 
GMSI £ 1 H I STDGbSLSVPATRQVI ANHHMPS 1 S FASGGDTDMTt) 
Y VAYVAKPPI NQRACHILECCEGL\AOS 1 1 STVGC-AFEbRFKQY 
LHSPPKVAbPPERLAGPEESAWGDEEDSLEHT^ YTNS I PGKEPPL 
GGbVDSRLiALTOPCALTALDOG PS P5L RDACS L P WD VGSTGTA P 
PGDGY VQADARGP PDHF.EHLYVNTQGLDAPEPEDSPK KDLFDMR 
PFEDALKJUHECSVAAGVTAAPLPLEDQVJPSPPTRRAPVAPTEEQ 
LROEPWYHGRMSRPJVASRMLRADGDFLVRDSVrNPGOYVLTGMH 
AGOPKKLLLVDPEGWRTKDVLFES I SHL I DKHbQtfGQPI VAAE 
SEbHLRGWSREP 


5393 




982 


GGPSAGMTMET0MSQ^CPRNLWLlX?PI>TVbLLXASADSOAAAP 
PKAVLKbEP P WINVLQ\ EDSVTLTCQGAPG P /ERS PS Z QW FHNG 
BNSDOCID: <WO__0153312A1_I_>



WO 01/53312



PCT/US00/34263



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








\NLIPTHT0PS\YRFKANNN\DSGEYfCOTGOfSL\SDPVHLTV 
I ,SEWLVLCT PH LE FOEGETI MLRCHS \XRDX P\ LVKVTFFQNGK 
S0KFSHUDPTFSIPQWHSHSGDYHCTGN1GYTLFSSKPVT1TV 
OV VSMGSSS PMG II VAW1 ATA VAAI VAAWAL I YCR KKR I SAN 
STDPVK.AAOFEPPGROMIAlRKRQLEtTNWDYETADGGYMTLNP 
RAPTDDDKKI YXTLPPNDHVNSNN 


5394






GGDSAGMTMETQKSONVCPR>lLWLLOPbTVlLLLLASADSOAAAP 
FKAVLKLEPPWI ^,0\EDSVTLTCQGAPQP/ERSDS 3 QWFHNG 
\NLIPTHTOPS\YRFKANWN\DSGEYTCOTGOTSL\SDPVHLTV 
ESEWLVLCTPHEEF0EGE7IMLRCHS\WRDKP\LVKVTFF0NGK 
SQKFSHLDPTFS I PQANHSHSGDYHCTGN 1G YTLFSSKPVTI TV 
OVPKMGSSSPMG J I VAWZ ATAVAATVAAWALI YCRKKRISAN 
STDPVKAAOFEPPGROMIAIRKRQLEETNTNDYETADGGYMTLNF 
RAPTDDDKNIYLTLPPNDHVNSNK 


5395


313E 




p^.sdaknqegLlntrrkstdsvpiskstlsrslsloasdpdgas 

fSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKXOTTK 
KPTETPPVKETOOEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLE P AAG PX AACPLDSES VEG W PF ASGGGRVQNS PP VG 
KKTI.PLITAPEAGEVTPSDSGGQEDSPAKGHSVR^EFDYSEDKS 
SlsDNOCFNPPPTKKlGKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRSPAEPNDI PTAKGTYTFDI DKWDDPNFNPFSSTSKMOESPKL 
P^OSYUFDPDTCDESVDPFXTSSKTPSSPSKSPASFEIPASAME 
ANGX'DGDGLNKPAKKKKTPLKTDTFRVKICSFKRSPIjSDPPSODP 
1 paatpetppvl s avvhatdeeklavtnqkwtcmtvdleadkod 

vpopsdlstfvnetkfss pteeldyrnsye2 eymek3 gsslpqd 
ddapkxqalylmfdtsqespvksspvrmsesptpcsgssfeete 
alvntaaknohpvprglapnqeshlovpskssokeleamglgtp 
sea i e3 tmegs fasadaiilsrlahpvslcgalpylepdlaexn 
f plfaqk lqr raah ptdvs i s ktaly srj gtae v ek pagllfqq 
pdldsalol araei itkerevsewkdky eesrrewiemrki vae 
yektlaqmiedeorsk£vs\hotvqolvlekeqa\ladi>nsvek 
\ sladlfr r y e kmkevleg fr kneevlkrcaqey lsr vkkeeor 
y0alkvha\ eek lpranae\ 1 aovrgxac?qepaahoaslaer s5 

CRWDALERTLEOKNKEIEELTKICDELIAKKGKS 


5396 


313' 


53: 


RASDAXJJOEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKXPRPPSLKKKQTTK 
KPTETPPVKETOOEPDEESLVPSGENLASETKTESAKTEGPSPA 
LuEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTLPLTTAPEAGEVTPSDSGGOEDSPAKGKSVRLEFDYSEPXS 
S KDNOOENPPFTXX j G KK P V AKMPLR R P KM KKTP EKLDNT PAS ? 
PRSPAEPNDI PI AXGTYTFD1DKWDDPNFNPFSSTSKMOESPKL 
FQQSYNFDPDTCDESVDPFKTSS KTPS S PS KSPA'SFEI PASAME 
ANGVDGDG LNK PAX KKKTPLKTDTFRVKXS PKRS PLSDPPS QD? 
TPAATPETPPVISAWHATDEEKLAVTNOKWTCMTVDLEADKQD 
YPQPSDLSTFVNETXFSSPTEELDYRNSYEIEYMEK3GSSLPQD 
CDAPKXOALYLMFDTSQESPVKSSPVKMSESPTPCSGSSFEETE 
AbVrn-AAK^OHPVPRGLAPNOESHLOVPEXSSQKELEAMGLGTP 
S5AIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN 
F PLFAOKLOREAAH PTDVS 3 SXTALYSR 1 GTAE VEKPAGLLFQQ 
PDLDSALQIARAEIITXEREVSEWKDXYEESRKEVMEMRKIVAE 
YEKT1A0MIEDE0REXSVS\HQTVQQLVLEXEQA\LADLNSVEX 
NSLADLFRRYTKMKXVLEGFRKWSBVLKRCAOEYIiSRVKKEEQR 
YQALKVHA\EEKLDRANAE\3A0VRGXAQ0S0AAH0ASLAERSS 
CRVNDALERTLEOKNXEIEELTKICDELIAKMGKS 


5397 


3135 


53*1 


RA S DA ION0EGLLNTRRXJSTDS V P I SKSTLSR S LSLQAS U FLXJAS 
SPGNPEAVALAPDAYSTGSSSASSTLKRTXXPRPPSLX3CK0TTX 
XPTETPPVKETQOEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVONSPPVG 
RXTLPLTTAPEAGEVTPSDSGGOEDSPAKGHSVRLEFPYSEDKS 
SWDr^OQENPPPTKKIGXKPVAKMPLRRPKMKXTPEKLPMTPASP 
P 3 S P AEPND I P IAKGTY TFDl DKWDDPN FN P F SST SKMQESP KL 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








pocsynfdphtcdts'vdprktssktpsspskspasfeipasame 
ang\tx;dglwkpakkkktplktdtfp.vkkspkksplsdp?sodp 
tpaatpetppvi sawhatdefj kiavtnqkwtcmtvdleadkqd 
yfcpsdlstf^tfetxfsspteeldyrns yeieymektgsslpqd 
ddapkk0alylmfdtsqespvk5spvrmsesptpcsgssfeete 
alwtaaknohpyprglapnoeshlqypekssokeiieamglgtp 
sea 2 e i tapecs fas adalls r lajip vsjxgaldyle pdlaexn 
? p l.faq k lqre a7vh ptdvs j s k ta l y s r i gtae ve kpagll fqo 
pdlds alqi arae 1 1 tkerevs £^kdxyeesrr evmemrk i vae 

YEKTlAOHlEt>EQREKSVS\H0TV0OLVl,EKE0A\IADLNSVEK 
VSLADLFRRYEKMKEVLEGFRK^EEVLKRCAOEYLSSVKKEEQR 
YCALK VKA\EEKLDRANAE \ I AOVRGXAQQEQAAHOASLAERSS 
CRVNDALERTLEOKNKEIEELTKICDELIAKNGKS 




St 


542t 


SGE V CRMESNF NQEGV PRPSYV FSADP 1 ARPSE 3 N r DG 1 KLDLS 
HEFSLVAPNTEANSFESKDYLQVCUURPFTQSEKELESEGCVH 
ILDSOTWLKEPOCILGRLSEK£SG\OM\AOKFSFFPGFLGPAT 
TOKEFFOGCJMHPXvKDLiKGOS^LlFTYGJUTNSGKTYTFOGTE 
EN 1 Rl LPRTLNVLFDSbOEKMTKMNLKPHRSRBYLRLSSEQEK 
EE 1 ASKSALLR01 KEVTVHNDSPDT1 YGS LTNSLNI SEFEES I K 
DYEOANLNMANS I XFSVWVSFFE-I YNEY I YDLFVPVSSKFQKKK 
MLR LS0DVKGY S F 1 KDLQW 1 QVS DS K RAY RLLKLG I XJ1QS VAFT 
KLNN'ASSRSHSl FTVKI LCI EDSF.MSRV1 RVSELSLCDLAGSER 
TMKTONEGERLRETGNINTSI.LT^GKCINVLKNSEKSKFOOHVP 
FR E S K LTH YF/QS F FNG KG K 3 CM 1 VN I S 0 C Y LA Y DET LNVLK F$ 
A3 A0KVCVPDTL.KSS0EKLFGFVXSS0DVSLDSNSNSK3 LNV7CR 
AT I S WENS LEDLKEDEDIjVeELEN AEETED / VGETKLLDEDLDK 
7LEFNKAFISKEEKRKLLDL1 FDLKKKLI NEXKEXLTLEFKI RE 
EVTQEFTOYWAORE/vDFKETLLOE^ElLEENAERRLAIFKDLVG 
KCD1-REEAAXD1CATKVETEEATACLELXFNCIKAEUUCTKGEL 
1KTKEELKKRENESDSLIOFLETSNKKJITONORIKELINJIDQ 
KEnTINEFONLKSHMENTFKCNDKADTSSLl INNKLl CNETV f EV 
PKDSKSKlCSHlRKRVNENELOODEPPAKKGSIKVSSAirEDOKK 
S EE VR PN 2 AE I ED ] R VLQENNEC 1 »RAFLI»TI ESELXNE KE EKAE 
LNKCIVHFOQELSLSEKKKLTLSKEVOOIOSrTYDlAIAELHVQK 
S KNOEOEE KI MKLS NE I ETATRS I TNNVSQI KLMKTK I DELRTL 
DSVSOISNIDLLNLRDZ^NGSEEDNLPNTOLULbGNDYLVSKQV 
KEYRIOEPNRENSFHSSIEAIWEF.CKE3VKASSKKSHQ1EELEQ 
0IEKL0AEVKGYKDENWRLKEKFKKN0DDLLKEKETL100LKEE 
LQEiCWTLDVO I QKVVEGXRALS ELTQGVTC YXAXI K ELETI LE 
TQXVERSHSAXLEQD3 LEKES1 J LXLERNl.KEFQEKLQDSVKNT 
KDLNVKELKLKEEI TQLTNNLQDMKJiLLQLKEE EEETNRQETEK 
LK EELS AS SARTQN\ LNADLORKEED YAOLKE KLTDAKKOI KQV 
OKEVSVMRDEDKLLR3KINELEKK)^OCSOELDMKOR\TICX3LK 
EOLINQKVEEA10OYERACKDLNVKEKI 3 EDMRMTLEEQEQTQV 
EODQVL\BAKLSEVERIATEl^RWRVXCTroLEriaWQRSNKEHE 
NNTDVLGKLTNLODELOES EQK YN ADR K XWLEEKMML I TQAKEA 
ENI RNXEMKKYAEDRERFFXQONEKE I LTAOLTEKDSDLQKWRE 
ERDQLVAALE I OLXALI S SNVOKDNE I EQLKR 1 1 SETS XI ETQI 
MDI KPKR1 SSADPDKLQTEPLSTS FE I SRNXI EDGS WLDSCEV 
STErJDOSTRFPXPELElOFTPLOPNKMAVKHPGCTTPVTVKlPK 
ARXRKSNEMEEDLVKCENKKNATFRTNLKFP3SDORNSSVKKEQ 
KV A I K P S S XK : Y SLR SQAS 1 1 G VN LAT KKXEGT LQXFGDFajO" S 
PS3LQSKAXK1IETMSSSXLSNVEASKENVSQPKRAKRXLYTSE 
1SSPJ D3SGQVIL,ViC0KMKESDK01 IXRRLRTXTAX 


5399 


70S 


230 


GPR MAXFLSQDQ I NEYKECFSLYDKQQRGX1 XATDttlVAMR CLG 
ASPTPGEVORHLOTHGI IX5NGEL DFSTFXT 3 NHKOI KQEDPXKE 
ILLAMLMVDKEKKGYA^KASDtiRSKLTSLGEXLTllKEV\DDLFRE 
\ADI EPNGKVKYDEFIHXITSYXDGTY 


5400 


933 


246 


SHCSSGKE1PP7NYFASRAAJLVA0NY I NYOOGTPHRVFEVQXVK 
CASKEDI PGRGRXYKiKFAVEEl 2 C-XQV3CWCTA2VLYPSTGQE 
TAPEVKFTFEGETGKNPDEEDNTFYQRLKSMKEPLEAONl\PDN J 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FGNVSPEMTLVLHLAKVACGY I IWQNSTFXTWK>WKIQTVKQV 
QR^DFI Ei^DYTILLHNIASQEl IPWQMQVLWflPQYGTKVKHNS 
RLPKEVQLE 


5401




1360 


TGWS YGPTTSLAFLAPR DF P FPPKLL1H PQAWR LS CGAGSMGS 
0AAAFWRNWASWEGSSSLSGCSMGCFKDDR1VFKTWMFSTYFME 
KKAFRODDMLFYVRRKIAYSGSESGADGRKAAEFEVF-VEVYRRD 

skklpglgdpdidweesvclnlil0kldymvtcavc7radggdi 
h1hk-kksq0vfaspskhpmdskgeesk1sypniffmidsf\be\ 
vfsdmtvgkgemvcvelvasdktntfcgvifogsiryealkkvy 
dwpvsvaarmaqk\msfgfskysnmef\vr\k,kgpogkghaema 
vsrvstgdtspcgteedssfaspmhervtsfstpptpernnrpa 
ffspslkrkvprnriaemkkshsandseeffkrtddggadlhkat 
nlrsrslsgtgrslvgswlklnradgnfllyahbtyvtlpiihri 
ltd i i.evrokpilmt 


5402 


3445 


1562 


GKCFJ MAAVV00NDLVFEFA.cNVMEDER0'J/;nPAI FPAVI VEH V 
PGADJ LNS YAGLACVEEPNDK J TESSLDVAEEE I IDDDDDDI TL 
TVEASCHDGDET1 ET1 EAAEALLNI4DSPGPMLDEKR1NNNI FSS 
PEDDMWAPVTHVSVTLDGIPE^ETOOVOEKYADSPGASSPEQ 

Di/DVirrovTvDDDDncDiTTDMi (sui/ifVMviirvrMTlYt VJirirT t. 
f KKKf^KK JL KFPKrUbrA I 1 trr* l 0 VMVlsJ^ KI^j JV\j?n HI liWfcr LiLi 

allodkatcpkyikwtqrekgifklvdskpvsrlwrkhknkpxd 
mnyepmg ralry y yorg i lakvegqr lvycfk empkdl 1 y i nde 
dps s £ i essdpslsss atsnrnotsrsrvs s s fgvkggattvlk 
fc^skaakpkdpvevaqpsevl.rtvoptospyptolfrtvhvvo 
pvqavpegeaartstmodetlmssvosir\t:oaptovpvwsp 
kjv'qo \ lhtvtlgrvplttv i astdps agtgsqk f i lqa jpssqp 
kt vl kenvmlqsqkags pps ivlgparv \00 v ltsnvqt i cngt 
vsv\asspsfs\atapwtl,fllgss0lvaiippgtv3tsvikt0 

ETKTLTOEVEKKESEDHLKENTEKTEOOPQPVVM-^SSSNGFTS 
C'VAM KC/NEL.LEPNS F 


5403 


3445 


1563 


gecfj maawoondlvfefasnvhedesqlc;dpai fpavi vehv 
pgad1 lksyaglacveepndmitessldvaef.e1 1 dddddditl, 
tveaschdcdetietieaaeallnmdspgpmldexrinnni fss 
peddmwapvthvsvtldgipevmetqovcekyadspgasspeq 

PKftKKf:RKTKPPT?Pn^P7\TTPNI9VKKKNKDGKGNTI YIiWEFIjL 

aliotk atcp k y i kwtqrekg i f kl vds kpvs r lwr kh kn k p\ d 
mnyepmgralryyyorgilakveguklvyqkkempkdliyinde 
dpfsslessdpslsssatsnrnotsrsrvssfpgvkggattvlk 
pgnsxaakpkdpvevaqpsevbrtvqptosfyptolfrtvhvvo 
pvoavpegeaartstmqdetlnssvosir\tioaptovpwvsp 
rnqq\lhtvtlqtvplttvi astdpsagtgsqkf: lqai pssqp 
mtvlkenvmlosqkagspps i vlgparv\qovltsnvqt 2 cngt 
vsv\asspsfs\atapwtlfblgssolvahppgtvitsviktq 
etktltoevekkesedhlkentekteqopopyvmvvsssngfts 
ovamkonellepnsf 


5404 


187 


1113 


LPVTLI FAKMKTLOSTLLLLLLVPLl KPAPPTQQDSRIIYDYGT 
DNFEES I FSQDYEDKYLDGKNI KEKETVI I PNEKSLQLQKDEAI 
TPLFPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARPNK I KKLT\AKDFAD J PNLRRLDFTGNLI EDI EDGTFSKI* 
SLVEELSLAENOLLrKLPVLPPKLTLFNAXYNKIKSRGIKANAFK 
KIKNLTFLYLDHNAbES VPLNL.PESLR V I HLQFNNI AS 1 TDDTF 
CKANDTSYTRDR1EEIRLEGNPIVLGKHPNSFICLKRLPIGSYF 


5405 


i 2199 


1220 


0NSRSLHMDP0NQHGSGSSLWICX>PSLDSRPRLDYERE1QPTA 
ILSLDOIKAIRGSNEYTEGPSWKRPAPRTAPRQEKHERTHEII 
P 2 NVKNN YEHRHTSHLGHAVLPSNARGPI LS RS TS TGS AASSGS 
NSSASSEOGLLGRSPPTRPVPGHRSERA3RT0PK0LIVDDLXGS 
LKEDLTOHKFICEQCGKCKCGECTAPRTLPSCLACNRQCLCSAE 
£rT/EYGTCMCL\VKGIFYHCSNDDEGDSYSDNPCSCSOSHCCSR 
YLCMGAMSLFLPCLLCYPPAKGCLKLCRRCYDKJHRPGCRCKNS 
N7VYCKLESCPSRG0GKPS 


5406 : 21S 


2732 


RWRTYWEGPLTFWDVAIEFCLEEWOCLDTA00NLYRNVMLENY 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KNLVFLG/IIAVSKFCL'TCLEOEkEPWEPKRRJlEMVAKPPVKC 
£HFTODFWPEOHZKDFFC.KATLP.RYXNCEHKNVHLKKDHKSVDE 
CKVHRGGYWGr-'KOCl.rATCSKIFLFDXCVKAFMKFSKSNRHKlS 
HTEKKLFKCJCECGKFrCK-.SHLAQHKIIHTRVNr-CXCEXCGKAF 
NCPS 1 1 TKHKRI NTGEK PYTCEECG KVFNWSSR LTTHK KNYTRY 
KLYKCEECGKAFNKSS 1 LTTHK 2 3 RTGEKFYXCXECAKAFNQSS 
N).TEHKKIHPGEKPYKCFECGKAFm<JPSTLTKHK^lHTGFKPYT 
CFECGKAFNQFSNLTJ KKR I HTA\£KFYKCTECCEAFSRS\SNL 
TX1WEIHTEKKPYKCEECCKAFKWSSKLTEHKLTHTGEKPYKCE 
KCGKAFNCPSI ITfOtNRIKTGEKPYTCEECGKVFNWSSPLTTHK 
KNYTRYKL.Y KCEBCGKAFNKSSILTTHKK1K3EKXFYKCEECGK 
AFKWSSKLTEHK1 THTGEKPY KCEECGKAFNHFSI LTK11KRIH7 
GEKPYKCEECGKAhTgSSNI»n>:KKJH'l'GEKFyKCEECGKAFrC 
5ENLTTHKKIHTGGKPYKCEECGKAFN0FSTLTXKKIIHTEEKF 
YKCEECGKAFKWSSTi,TKHKIIHTGEKPYKCEECG\KAFKL£ST 
LSTHKIIHTGEKPYKCEKCGKAFNRPSNLIEHKK1HTGE0PYKC 
EECGKAFNYSSHLNTHKR 1 1JTKE0PYKCKECGKAFN0YSNLTTH 
NK 1 HTGEKl/VKPEDVTV 3 LTTPCT FSN IK 


5407


3 




RFRRROSSCCTGKL^GKLLRAAPRFCRRTETDMEQGKGLAVLIL 
AI 1 LLQGTLAQS I KGNHLV KVYDYOEDGSVLLTCDAEAKNITWF 
KDGKM3GFLTEDXKKWNLGSNAKDPRGMYOCKGSONKSKPLOVY 
Y RMCQN CIELNAATI f. G FLFAE 1 VS 1 FDLAVGV YF1 AGTGME FR 
OS \RASDKQTLLP VKDFAV70PLKDPR KMTQY£HLQGN\OI»RRN 


5408

1 


214 b 


6128 


OGSKGTCHPOAOOPWOEGVWOEAPSQSEPWGOSOEPPTKPOR^P 
HAR QHTPLPLGSAD Y R RWS VR POGPHKDPKDSR DAAKREQGS L 
APRPVPASRGGXTLCKGYRCAPPGPPAQFQRFICSASPPWASRF 
STPCPGGAVREDTYPVGTOGVPSLALAQGGPGGSWRFLEWKSMP 
RLPTDl.DIGGPWFPJJYCFKRSCW\niA2SOEDOLATCWOAEHCGE 
VRNKDMS WPEEMS FI ANSSK I DRHKVPTEKGATGLSNLGNTCFM 
NSS30CVSNTQPLTQYF3SGRHI.YELNRTNP1GMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFOOODSQELLAFL 
LDGLHEDLNRVHEKI'YVELKDSDGRPDWEVAAEAV7DMHLRRNRS 
IWDLFHGOLRS0VKCKTCGHISVRFDPFNFL>SLPLPMDSYMHL 
ElTVIKbDG*ITPVRYGLRLMMDEKYTGLKKQbSDl>CGLNSEQIb 
LAEVHG SN2KNFPODNOKVR1.SVSGFLCAKE1PVPVSPJSASSP 
TQTDFS S S PSTNEM FTLTTNGDLPR P I FI PNGMPN7 WPCGTEX 
NFTNGMVNGHKPS LPDS F > TG Y 1 1 AVK R XMMRTELY FLSS0KN R 
PSLFGMPLIVPCTVHTRKKPLYDAVWIQVSRLASPLPPOEASMH 
AQrx:DD£MGyQYPFTLRW0KDGNSCAWCPWYRFCRGCKIDCX5E 
DR,IlF3GNAYIAVDWHP7 v AIHI>RYOTSCERWDEHESVEOSRRAO 
VEPIKLDSCLRAFlSEEELGENEMYYCSKCKTHCIiATKXI^DliVm 
LPPILI IHLKRP0FVNGRW1 KSOKIVKFPRESFDPSAFLVPRDP 
ALCQH K PLTPQGDELS F PR 3 h AR E VKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSFKGSPSSSRKSGTSCPSSKNSSPKSSPRTW3RS 
XGRLRLPQIGSKNKLS S S KE NLDAS KENGAGQI CELADALSRGH 
VLGGSQPELVTPQDf-EVALANGFLYEHEACGNGCGNGYSNGQLG 
NHSEEDSTDDQREDTK I KPI YNLYAISCHSGILGGGHYVTYAKN 
PNCKWYCYNDSSCKELHPDE3DTDSAYILFYEQ0G1DYA0FLPK 
TDG KKMADTS SMPE D F E SDY \ EK YC VLO 


5409


2745 


612* 


OG S KGTCH PQACOPW^FGVKQEAPSQSB PWGQS QEPPTMPORLP 
HAROHTPLPI^SADYRRVVSVRpCGPHRDPW>SRDAAKREOGSL 
APRPVPASRGGKTIiCKGYROAPPGPPAOFQRPICSASPPWASRF 
STPCPGGAVREDTTY PVGTOGV PSLAlAQGGPOGSWRFbEWKSMP 
RLPTDLDI GG PWFPH Y D FERS CWVRAI SQEDQLATCWQAEHCGE 
VRNKDKSWPEEMSFI AKSSKI DRHKVPTEKGATGLSNLGNTCFN 
NSSIOCVSNTQPLTQY FI SGRHLYELNRTNPIGMKGHMAKCYGD 
LVOELWSGTQXNVAPLKLRVgTlAKYAPRPNGFQQODSQEUAFX, 
LtX^LKEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS 
I VVDLFKGQLRSQV K C KTCGH 3 S VR FDP FN FLSL PLP MDS YMHL 
EITVI KXDGTTP VK YG LR L WDEKYTGLKKOLSDLCGLNSEQI h 
IJ^\rHGSNIK>JFP0DKQKVRLSVSGFbCAFElPVPVSP3SASSP 








TQTDFF S SPSTNEKFTLTTNGDLPR P 1 VI PNGM PNTWPCGTEK 
NFTNGI-r^GlIMPSLPDSPFTGYlIAVHRKMMRTELYFLSSOKNR 
PSLFG M P L I V P CT VKTR K KDLY DAVW I QVS R LAS P LP PQEA S NK 
AQDCDD SMGYOY P FTLR WQKDGNS CAWCPW YR FCRGCK 1 DCGE 
DRAF3GKAY3 AVDWliPTALHLiRYQTSQKRWtJEHESVECSRRAO 
VEP1NLDSCLRAFTSEEELGENEMYYCSKCKTKCLATKKLDLWR 
LPP 3 LI 1 HLKRFOFVNGRW 1 KSQX1 VKFPRES KDPSAFI.VPRDP 
ALCQHKPLTFQGDELSEPRILAREVKXVDAQSSAGEEDVI.LSKS 
PSSLSANIISSFKGSPSSSRKSGTSCPSSXflSSPNSSPRTLGRS 
KGRLRLPOIGSKNKLSSSKENLDASKENGAGOICELADALSRGH 
VLGGSC'PFLVTPOOHEVALANGFLYEHEACGNGCGNGYSNGOLG 
NHSEEDSTDDOREDTR 1 KP I YNLYAI SCHSG1 LGGGHY VTYAKN , 
PNCKWYCYNDSSCKELHPOE1 DTDSAYI LFYEQOGI DYAQFLPK 
TDGKKK7uOTSSMDEDFESDY\EKYCVLO 1 


5410 



2 


71C 


LRFPGOARHVWLAARMOAPHKEHLYKLliVIGDLGVGKTS J J KRY 
VH0NFSG:iYRP»TlGVDFALKVLHWDPETWRL0LKD3AG0ERFG 
NMTRVYYKEAMGAFIVFDVTRPATFEAVAKWKKDLDSKLSLPNG 
XPVSWLLANKCDOGKDVIjMNNGLKWDOFCKEHGFVGWFETSAK 
eninileasrclvkhilanecdlmes1epdvvkphltstkvasc 
SG\CAK 1 LVG'J'KAGVK 


5411

1 

1 

! 


1302 


2 89 


TGPAAAG^RKALGSFGKPSPVTGLRAARRRRTRPSAPAJ^PSVGC 
GKRRES DAGACGERASVRTGSGRRGGRTKIAGDS EQTLQNHQOPN 
GGEFFLIGVSGGTASGKSSVCAKIVQLLGONEVDYROKQWILS 
ODSFYR VLTSFOKAKALKGOFNFDKPDAFDNELI LKTLKE J TEG 
KTVOIPVYDFVSHSRKEETVTVYPADVVLFEGlliArYSOER/IR 
DLFQMKL FVDTD ADT RLS RR VLKDI S E RGRDLEQ 1 LSS; KT LR ? V 
XPA\PEEFC1iPPK\KYADVTIPR\GADN\RVP1NLIVCH10\D1 
LNGGPS \NRQTNGCLNGYTPSRKRQASESSSRPH 


5412 


3180 


313 


OGISNFFKKEANFKFEVSGYLISPLRSPFVDFALEWSLKiASPK'N 
KMEGE S £ R KE I H T P VSDX XX K K CS I H X E R PQXH S H E I FR DS S L V 
NEQSQITR^KXRKKDFOHLISSPLKKSRICDETANATSTLKKRX 
KRRYSALEVDEE/vGVTWliVDKENIKNTPKHFRKDVDWCVDMS 
JEC/KLPR K \ PKTDKFOVLAKSH\AHKSEALHSKVREKKNKKHOR 
KAASKESO^A\RF)TLPOSEFPTOEESWLS VGPGGEI TELP \ ASA 
HKNKSKKKKKXSSNREYET\LAMPEGSQAGREAGTDMOESOPTV 
GLDDETP0LLGPTHKKKSKXXKXXKSNHOEFESLAMPEGEOVGS 
E VGADMC 555 \ R P AVG LHGETAG X P APAY KNKSKKKKKK SNKO F. F 
EAVAMPESLESAYPEGSOVGSEVGTVEGSTALKGFKESKSTKKX 
SKKRXLTSVKRAKYSGDDFSVPSKNSESTLFDSVEGDGAMMEEG 
VXS RPR 0 K KTQAC1AS KHVQEAPRLEP ANEEHNVETAEDSE 3 R Y 
LSADSGDADDSDADLGSAVXQLQEFI PN3 KDRATST3 XRMYRDD 
LERFKEFKAOGVAJKFGKFSVKENKQLEXMVEDFLALTGIESAD 
KLLYTDKYPEEKSVlTNLKRRYSFRLK2G\RNIAJ5PWXLlyYRA 
KKMFDVNKYKGRYSEGDTEKLKMYHSLLGNDWKT3GEMVARRSL 
SVALKFSOlSSQFuNRGAWSXSETRKLIKAVEEVlLKKFlSPQELX 
EVDSXLg EM F ESCLS I VREK LY KG I SWVEVEAKVQTRNVJMQC KS 
XWTEIL7 KR MTNGRR I YYGKNALRAKVSLI ERLYE I NVSDTNE I 
DWEDLASAIGDVPPSYVOTKFSRLKAVYVPFMOKXTFPEIIDYL 
YETTLPLLKEKLEKMriEXXGTKIQTPAAPKQVFPFRDIFYYEDD 
SEGGGHR KR KRRPRRHAWFTPV I PVLWEAXAGWI 1 


5413 


3753 


1304 


RFPAGVAPRRAMANVSKKVSWSGRDRDDEEAAPLLRRTARPGGG 
TPLLNGAGPGAAROSPRSALFRVGHMSSVKLDDELLEP\DMDPP 
HPFPKEIPKNEKLLSLKYESLDYDNSENQLFLEEERRINHTAFR 
TVE I KR WV I CAL 3 G I LTGLVACFI Dl W ENLAGLKYRV I KGN 1 D 
XFTEKGGLS FSLLLWATLNAAFVLVGSVIVAFI EPVAAGSGI PQ 
IKCFLNGVKI PH\ATRLKTLV3 KVSGVILSWGGLAVGKEGPK3H 
SGSVIAAGISQGRSTSLKRDFKI FEYLRRDTEKRDFVSAGAAAG 
VSAAFGAP VGGVXFSLEEGAS FWNQFLTWRI FFASMI STFTLN F 
VLSrYHGNMWDLSSPGLINFGRFDSEKMAYTIHEIPVFIAMGW 
GG VLGAV r NALN YWLTMFRIRY I HRPCLQVI EAVLVAAVTATVA 
FVLlYSSRDCOPL^CGGSMSYPLOLFCADGEYNSKiAAAFFNTPEK 








SWSLFHDPPGSYNPLTLGLFTLVYFFIACWTYG^TVSAGVFIP 
SLL : GAAKGRL7G I SLSY LTGAA I WAD PG K Y ALMGAAAObGG I V 
RKTLSLTVIMMEATSNVTYGFP5MLVLMTAKlVGDVF]EGbYT>M 
H1OL0SVPFLHWEAFVTSHSLTAREVMSTPVTCLRRREXVGVIV 
DVLtSDT ASKKI1G FPVV EKAEOTOP ARUCG1>I 1 »R SQL i VbbKHXV 
FVERSNLGLV0RRLRLKDFRDAYPRFPPIOSIHVS0DERECTKD 
bSEKKNPSPYTVPQEASLPBVPKbFRALGLRHL/JWDNRMCWG 
LVTRKDLARYRbGKRGLESbSLAQT 


5414


2130 


390 


GVASAWDRALFSPbLSPTSRVFRTSPPRCVSTETGRRDRARVPS 
QriCSVLOGKLPVSGRTSLACVRSILLSPASSPRXVGIVGGTGAR 
AGAAPRDHGRTOHRRPSSARRMTRTTGOCLAPRGCOGPRGTRSP 
RSPRSR TRRGCS AS PACLP/CR5AL I VAVbC Y I N J .LNYMDRFTV 
AC-VLPD1E0FFNIGDSS5GLI0TVF1SSYMVLAPVFGYLGDRYN 
RXYLMCGGI AFWSLVTLGSSFI PGEHFWLLbLTRGIA/GVGEASY 
ST I APTLI ADLFVATCRSRMLS 1 FY FAT PVGSGLGY I AGS XV KD 
MAGDWHWALKVTPGLGWAVLLLFI.V^/REPPRGAVERHSDl.PPL 
NPTSWWADLRALJ^iRNPSFVjSSLGFTAVAFVTGSIALVAPAFUL 
RSRWLGETPPCLPGDSCSSSDSLJ FGLITCLTGVbGVGbGVEI 
SRRLRHSNPRADPLVCATGUiGSAPFLFLSLACAKGSlVATYlF 
IFIGETLbSMNWAIVADILLYVVJ PTRtfSTAEAFQI VESKLLGD 
AGS PY hi GUI SnRLRRNWPPSFLSEFRALQFSLtt^CAFVGALGG 
AAFLGTAHbH 


5415


<>9< 


2966 


IPFKTKLr:bOKH\LTTLT\NQECATlFCEV0KLRPRNE0REMEL 
I J SFLRCLFIlBKOKEHIH 2 GEMKOTSCMAAEN J GSELFPSATRF 
RLDMLKNKAKRSLTESLESIbSRGNKA^GLOEHS 1 SVDLUSSLS 
STLFNTSKEPSVCEKEAbPISESSFKLbGSSEDLSSDSESHLPE 
EPAFLS POQAFRRRANTLSHr PI ECQE?PQ FANGS PGVSQRKLM 
RYHSVSTETPKE«Kr3FESKANHLGDSGGTPVXTRRHSWR00IFL 
RVATP0KACDESSRYEDYSELGELPPRSPLEPVCE3GPFGPPPE 
EKKRTSKELRELWOKAILQQILLLRMEXENOKLOASENDLLNKR 
LKLDYEE1TPCLKEVTTVWEX>JLSTPGRSKIKFDMEKKHSAVGQ 
GVP\rvHHKGEIWKFLA£OFHLKHCFPSKQOPKDVPYKELbXOLT 
S00HA7 LIDLGRTFPTHPYFSAQLGAGQLSLYN1 LKAYSLLDQS 
VGYCCGLSFVAGlLLbHMSEEEAFKMbXFL^FPMGLRKQYKPDM 
1 1 LC 3 0K YQLSR LLHD YKRDLYNHLK EH EI GPSL YAAPWFLTMF 
ASQFVIA; F VARVFDM I FLQGTZV1 FKVALSLLGF H K PLI LQHEN 
bETIVPFTKSTLPNLGLVOKEKriNOVFEMDIAKOLOAYEVEYH 
VLOEEi,lDSSPLSDNQRKDKLEKTN£SLRKONLDLLEOU?VANG 
RIOPlEATIEKLLSSESXLKCAMLTLELERSAbLOTVEELRRRS 
AXPSDKEPECTOPEPTGD 


5416 


21 


4074 


KSpbFCFWGGKAGDlLSGDOPXEOKDPYFVETFYGyOLDLDFLX 
YVDD10KGNTIKRLN10XRRKPSVPCPEPRTTSGO0GIWTSTES 
LSSSKSDDNKOCPNFLIARSQVTSTF1SKPPPPLETSLPFLTIP 
ENROEPPPSPQLPXHKUrVTKTLMETRRRLEOERATMOMTFGEF 
RRPRLASFC^^Ta'SSbPSFVGSGNHNPAiCHOEiONGYOGNGDYG 
SYAPAAPTTSSMGSSlFHSPLSSGISTPVrNVSPMKLQHIREQM 
AIALKRbKELEEQVRTI PVLQVKI SVLQEEXROLVSQLKNORAA 
SOIMVCGYRKRSYSAGNASOLpEOLSRARRSGGELYIDYEEEEME 
TVEOSTQR I XEFR0L\TADMOALE0K 1 QDSSCEASSELRENGEC 
RSVAVGAEENMNDIWYHRGSRSCKDAAVGTLVEMRNCGVSVTE 
AMLGVMTEADXEIELOOOTIESLKEKIYRLEVOLKETTKDREMT 
K L KQ ELOAAG S R K KVD XATMAQ P LVFS KW EAWO-TR DQM VG S H 
MDLVI>TCVGrSVETNSVGlSCQPECKNKVVGPELPMNWWIVKER 
VEMHDRCAGRSVEMCDXSVSVEVSVCETGSNTEESVNDLTLLXT 
KLHLKEVRS1GCGDCSVDVTVCSPKECASRGVNTEAVSQVEAAV 
MAVPRTADODTSTDLEOVHGFTNTETATLIESCTNTCLSTLDXQ 
TSTOTVETRTVAVGEGRVKDIMSSTKTRSIGVGTLLSGHSGFDR 
PSAVKTKESGVGOJNINDNYLVGLKMRTIACGPPOLTVGLTASR 
RSVGVGDDPVGSSLENPOPOAPLGMMTGbDHYIERlQKLLAEOO 
TLbAEIJYSElAEAFGEPHS^GSLKSOLISTbSSINSVMKSAST 
EEl^RNPDFOKTSbGKlTGSYLGYTCXCGGLOSGSPbSSOTSOPE 








ESTLK5IMKXKDGNKDSNGAKKNL0FVG3NGGyET7£?SDDSSSD 
ESSSSESDDECDVIEYPI.EBEEEEEDEDTRGMAEGHHAVNIEGL 
KSA^VEnEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTlND 
PKALTSKDMRFCLNTLQHEWFRVSS0XSA3 PAMVGDY I AAFEA3 
S PDVLRY V I NLADGNGNTALKY S VSHSN FE J VKLLJ .DADVCNVD 
HQNKAGY TPIM1 JXALAAVEAE K DMR I V EEL? GCGDVN AKASQAG 
0TALMIAVSHGRIDNVKGLLACGADW30DDEGSTALMCASEHG 
H VE I V KLLLAO PGCNG H LEDNDG STAL S I ALE AGH K D I AVL.L YA 
HVN7AKAQSPGTPRLGRKTSPGPTHRGSFD 






4074 


XS0LFCFWGG:<AGD1LSGDC;DKEC^KDPYFVETPYGYQLDLDFLK 
YVDD3 QKGNTI KRLNIOKRRKPSVPCFEPRITSGCOG: WTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENR0J-FPF5;P0LPKilNLHVTKTLMETRRRI>E0ERATM0MTPGEF 
RRPR LA S FGGMGTTS S L PS FVG S GNKNP AKHC-LQNGY OGNGDYG 
SYAPAAPTTSSMSSSIRHSPLSSGISTPVTNVSPMKLQHIREQM 
AIAL.KRLKELEE0VHT3 PVLOVKI SV10EEKR0LVSQLKN0RAA 
SQINVCGVRKRSYSAGNASCLEQLSRARRSGGELY1DYEEEEME 
TVE0ST0R 3 KEFROL\TADMQALE0K30DSSCEASSELRENGEC 
R S VAVGAE E NMND3 WY HRGS R 5 CXDAAVG TL VEMK N CG VSVTE 
AMbGVXTEADKEIELQOUTlESLKEKlYRLEVQLRFTTHDREMT 
K L KOE LOAAG S R K KVD KATMAQ P LV F S KW EAWQTR DQMV GS H 
MDLVDTCVGTSVETNSVGISCOPECKNKAA'GPELPWN^WIVKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCBTGSNTEESVNDLTLLKT 
NLNLKEVKS3GCGDCSVDVTVCSPKECASRGVNTEAVSQVEAAV 
MAVPRTAD0DTSTDLEOVHQFTNTETATL3ESCTNTCLSTLDKO 
TSTQTVETRTVAVGEGRVKD3NSSTKTRS1GVGTLLSGHSGFDR 
PSAVKTKESGVG0IN1NDNYLVGLKMRT3ACGPPQ1.TVGI.TASR 
R5VGVGDDPVGESLENPQPQAPLGMMTGLDHYIER30KLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNS0L1STLSS3NSVMKSAST 
EELRNPDF0KTSLGK3TGSYLGYTCKCGGLOSGSPLSSOTS0PE 
QBVGTSEG KP I SSLDAFPTQEGTLSPVNLTDDQIAAGLYACTNN 
ESTLK SIMKK KDGNTKDS NGAK KN LQFVG 1 NGGYETTS S DDS S SD 
ESSSSESDDECDVI EYPLEEEEEEEDEDTRGHAEGHHAVNIEGL 
KSARVEI)E«0VO£C^PEKVSlRERrTLSEK>JLSACNLLKOTIND 
PKALTSKDMRFC^NTTLOHEWFKVSSOKSAl PAMVGDY ] AAFEAI 
S PDV LR YV 1 NliADGNGNTALHY S VSHSNFE 3 VKLLI JDADVCNVD 
HQNKAGYTF3 M3^AALAAVEABKDr>5R 3 VEELFGCGDVNAKASQAG 
OTALMLAVSHGR I DMVKGLLACGADVN10DDEGSTALMCASEHG 
HVE 1 VKhLLAQPGCNGH LEDNDGSTAL5 3 ALEAGHKD 1 AVLLYA 


5418 


24 


1133 


SVPRAGGDKETGAAELYD0ALLG3LOHVGNVODFLRVLFGFLYR 

ktdfyrllrhpsdrmgfppgaaoalvlqvfktfdhmarqddekr 
rqeleekirrkeeeeaktvsaaaaekepvpvpv0e3e3dsttel 
dgkoeveicvqppgpvkemahgsoeaeapgavagaaevpr\ep?i 
lpriqeofqknpdsyngavrenytwsodytdlevrvpvpkhwk 

GKQVS VALSSSS 1 RVAMLEENGERVLMZGKLTHKI NTESSLWSL 
EPGKCVWNLS KVGEYKWNA1 LEGEEP1 DI DK1 NKERSmATVDE 
EEOAVLDRLTPDYHQKLgGKPOSHELKVHEKLKKGWDAEGSPFR 


5419 


13SL 


259 


gth pldpd lv £ rts vqg p lmtmac pgmsdtees pflg praaeeg 
seseaceafgrrkseeegrrsdtsgfgrsrkhkvnwkhperada 

XDPASLPOC/LGP/DCTOPAQPSSKYCSDDCGMKLAANRIYEIL 
P0R3 0OW0QSPC 1 ASEHGKKLLER IRREOOSARTRLQEMERRFH 
ELEA1 3 LRAKQCAVREDEESNEGDSDDTDL03 FCVS CGHPI NPR 
VALRHMER CYA K Y ESQTS FGS MY PTR 3 EG ATRLFCD V YNPQS KT 
YCKRLOVLCPEHSRDPKVPADEVCGCPLVRDVFELTGDFCRLPK 
RC<NRHYa<EKl,RRAE\^LERV7<VWYKLDELFE0ERNWTA>lTM 
RAGLLALWLH0T30HDPLTTDLRSSADR 


5420 


117 


173 3 


KEAGGACPFKGGASGRLYLSPRLPRV5VAGCEERPLGWVWVLGG 
GGFL PAR P PRAQRH LGFSHAEQSMEAPDYEVLSVREOLFHER I R 








EC1 : STLLFA?LY1LCKIFLTRFKKPAEFTT\GMMKMPPSTRL7 

llklctftla: au;avlllpfsi iskzvllslprnyyi QWLN6S 

LJHCLWNLVFLFSNL£LiFLMPFAyrFTESEGFAGSRKGVLGRV 
YET VVMLMLLTLLVLGK VWVASA 2 VDKNXANR ES LYD FWE YY LP 
YLy5CISFLGVL^LLVCTPLGLARMFSVTGKLLVKPRLiLEDLBE 
OLYCSAFEEAALTRRlCNPTSCWLPLDMEIiLHROVI^LOTQRVL 
LE K R R KASAWORN LG Y PLAMl.CLLVLTGLS VL 1 V A 1 H 1 l.Et ,L1 D 
EAAK P RGMQGTSLGOVS KS KDGS FGAVI OWL J F Y I *M VS S WGF 
YSSPLFR SLR PRWHDTAMTOI I C- N C V G LLVXS SALPV FS RTLGL 
TRFDLLGDFGRFNWLGNFY1VFLYNAAFAGLTTLCLVKTFTAAV 
RAEL J RAFGERE 


5421


117 


1732 


neaggacpfkggasgrLylsprlprvsvagceerplcww/vlgg 
ggflparppraqrhlgfsiiaeoskeapdyevlsvreqlfherir 
ECJ i stllfatlyilch:fltrfkkpaeftt\gmm)o^ppstrl/ 

LLELCTFTLA3ALGAVLLLPFSIlSNEVLLSLPRNYYlQWU>rGS 
LIHGLK'NLVFLFSNLSLIFLMPFAYFFTESEGFAGSKKGVLGRV 
YETVVKLMLLTLLVLGK.VWVASA1VDKNKANRESLYDFWEYYLP 
YLYS C I SFLGVLLLLVCTPLGLARHFS VTGKLLVK PR LLED'^EE 
OLYCSAFEEAALTRRICNPTSCWLPLDMELLHROVLAIOTORVIj 
LEKKK KASAW0RNU;YPLAMLCL.LVLTGLS VL I VA 1 HI LELL 1 0 
KAAi^PRGMOGTS LGOVS FS KLGS FGA V I QVVLI FYLMVSS VVGF 
YSSPLFRSLRPRWHDTAMTOHGNCVCLLVLSSALPVFSRTLGL 
•JT?FDLLGDFGRFNWLGNFyiVFLYNAA?AGLTTLCLVKTFTAAV 
RAEL I RAFGERE 


5422 




126; j 


SCGESLPTWLAGASRPGIGRKGGAVIGGRGGSSPAOVLLSPGPVF 
KAGO:WVJHLSRDOAGVORCDLGSSQPP?LX5FKRFSCLSLPSSWD 
YRSTVacVSKMEADLSGFUIDAPRWORTFLGRVKHFLNITDPR 
TVFVSERELDVfAKVMVEKSRMGWPPGTOVEQI .1 -YAKKLYDSAF 
HPDTG E KMNV J G R MS FQL PGGM 1 3 TGFMLQFYRTM P AV 1 FW QWV 
NQS FN AL VNYTNRN AAS FTSVRQMALS Y FTATTTA VATAVGMNM 
LTKKvAPPLVGRWVpFAAVAAANCVNI PMMRCOEL: KGI CVKDRN 
ENE1 GHSRRAAA J G I TQWISR I TMSAPGMI LLPV 3 MERLEKLH 
F?40KVKVL/SAPLQVMb5GCFLIFMVPVACGLFPCKCELPVSYL 
EPKLODTIKAKYGELEPYVYFNKGL 


5423 




905 


GVSI^ALGEEKAEAEASEDTKAOSYGHGSCRERELUIPGPMSGEQ 
FPRLb A hlGGL I S PVWGA EG I PAPTCW 1 G TDPGG PS RAHQ PQAS D 
ANRE P VAERSEPALS GLPPATMG SGDLLLSGESQVE KTKLS S S E 
EFF^TLSLPRTT1CSG}TDADTEDDPSLADLP0AI,DLS00PHSSG 
LSCU CWKSVL£PGSAA0PSSCS1£ASSTG£SL0GH0ERAEPRG 
GSLAKVSSSLEPWPQEFESWGLGFRPQWSPQPVFSGGDASGL 
GRRRLSFOAEYWACVLPDSLPPSPLRHSPLWNPNKEYEDLLpyT 
Y PLR PG PQL PKK LDS R VFADPVLODSG VDLDS FS VS PASTL XS P 
TNVS?NCPPAEATALPFSGPREPSLKOWPSRVPQKOGGMGLASW 
SOLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDnGVJPSPRPEREKRTSQSARRPTCTESRtfKSEEEVESDDEY 
LALP7^LTOVSSLVSYLGS 3 STLVTLPTGDIKGOS PLEVSDSDG 
PASFPSSSSOSOLPPGAALOGSGDPEGQNPCFLRSFVRAHDSAG 
EGSLGSSOALGVSSGLLKTRPSLPARLDRWPFSDPDVEGOLPRK 
GGE0GKESX.V0C\VKTFC\COLEELICWLyNV\AOVTDHGTPAR 
SNLTSLKXSSWLyROFKjaJlDEHOSLTESWJKGElLLOCLLE 
NTPVLEDVLGRIAKOSC-ELESRADRLYDS ILASLDMLAGCTL1 P 
DKKPKAAMEHPCEGV 


5424 


3l8fc 


905 


GVSMALGEEKAEAEASEDTKAOSYGRGSCRERELDJFGPMSGEQ 
PPRLEAEGGL.3 SP W7GAEG1 PAPTCW3 GT DPGGPS RAKOPOASD 
ANREPVAERSEPALSGLPPATMGSGDLLLSGESOVEKTICLSSSE 
EFPOTLSLPRTTICSGKDADT^DDPSIADLPOAJLDLSOOPHSSG 
LSCLSC-WXS VLSPGSAAQPSSCS I SASSTGSSLOGHOERAEPRG 
GSLAKVSSSLEPVVPQEPSSVVGLGPPPOWSPOPVFSGGDASGL 
GRRRLS FQAEYWACVLPDSLPPS PDRHSPLKNPNKE YEDLLDYT 
YPLRPGPOLPKHLDSRVFADPVLODSGVDLDSFSVSPASTLKSP 
TNVSPNCPPAEATALPFSGPREPSLKOWPSRVPOKCGGMGLASW 









SOLASTPRAPGSRDA^WE^REPALRGAKDKLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSOS\ARRPTCTESRWKSEEEVESDDEY 
IiALPARLTQVSSLVSYLGSlSTLVTLPTGDIKGOSPLEVSDSDG 
PASFPSSSSQSQLPPGAALCGSGDPHG^NPCFLRSFVRAHDSAG 
EGSLGSSOALGVSSGLLKTRPSLPARIiDRWPFSDPDVEGObPRK 
GGE0GK£SLVOC\VKTFC\C0LEFLICWLyNV\ADVTDHGTPAR 
SNLTSLKXSSLOLYROFKXDIDKHOSLTESVLOKGEILLOCLLE 
NTPVLEDVLGR IAXQSGEIiESKADRLYDS 1 LASLDKLAGCTLIP 
DKKPMAAKEHPCEXJV 


5425
1 




lis 


GFCP5 PS LGHQ P PR VI/HPTMSMA VETPGF FMAT VGLLMLGVTI P 
NSYWRVSTVHGNVITTNTIFENLVJFSCATDSLGVYI^CWEFPSML 
AL.SGYI0ACRALMITA2LLGFLGLLLGIAGLRCTN1GGLELSRK 
AKLAATAGAPH \ I LPG I CGMVA I \ S WYA FN I TR \ DFSDPLYPGT 
KYELG?ALYLGWSASl>ISILGGLCLCSACCCGSDEDPAASARRP 
yOAPVSVMPVATSDCEGDSSFGKYGRNALRVAALCRGPRCLPTA 
PKKRGPGRGPFPYSNLRGRPRpVPVAPPRPRPRVLHSHGPSQAK 
NCSWEVAYLPSEAGSLl F 


5426




3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDOPSAPSDPTDQP 
PAAHAK PD PGSGGOPAG PGAAG EALAVLTS FGRRDLVL 1 PVYLA 
GAVGLSVGFVLFGJLALYLGWKRVRUfc'KUKSbRAA^OLLDDEEQL 
TAKTb Y MSHRE LP A WVS FPDVE KA EWLNK I VAQVWPFLGQ YMEK 
LLiAETVAPAVRGSNPHXOTFTFTK VELGEKPLR I IGVKVHPGQR 
KEQ 1 LLDLNI S Y VGDVQ I DVEVKK Y FCKAG V KGMQLHGVLR VI L 
EPLIGULPFVGAVSMFFIRRPTliPINWTGMTNLLDIPGI^SLSD 
TM1MDS I AAF1> VLPNRLLV? L V PE LQDVA0LR5 PLPRG 1 1 R IHL 
LAARGLSSKDKYVKGLIEGKSDPYAV/RLGT0TFCSRV1PEELW 
POMGETYEVMVHEVPGOEIEVEVFDKDPDKDDFLGRMKLDVGKV 
LQAS VLDDWFP IjQGGQGQVH I >R LbW 7,SL^DAEKLEQVLQWNWG 
VS SRPDPPSAAILVVYLDRAQDLPMVTSELY PPQLKXGNXEPNP 
MVOLSIQDVTOESKAVYSTNCPWEEAFRFFLQDPQSQELDVQV 
KDDSRALTLGAXTLPLARLLT APEL1 LDQWFQLS S3G PNS RLYM 
KLVMR I LYLDSSEI CFPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTP DSOFGTEHVLR I HVLE AODLI AKDR FLGGLVKGKSBPY 
VKLKLAGRSFRSHWREDLNFRWNEVFEVIVTSVPGOELEVEVF 
DKDLDIC^DFLGRCKVTU^TTVLNSGFLDEWLTLEDVPSGRLHLRL 
ERliTPRPTAAELEEVLQVWSLlOTOKSAEIiAAALLSIYMERAED 
LPLR KGTKHLS P Y ATLTVGDS S HKTKT I SQTS AP WtDES AS FL.I 
RKPKTESLELQVRGEGTGVLGSLSLPLSELLVADQIiCLDRWFTL 
SSGQGQVLLRAQLG I LVSQHSGVEAKSHS YSHSSSS LSEEPELS 
GGPPHIT3SAPEV\RORLTHVDSPLEAPAGPU50VKLTLWYYSE 
ERKLVSIVHGCRSLRQWGRDPPDFYVSLLLLPDKNRGTKRRTSQ 
KKRTI>SPEFNERFEWELPLDEAORRKLPVSVKSNSSFMSREREI* 
LGKVOLDLAETDLSQGVARWYDIJviDNKPKGSS 


5427
1 

1 

j 


4. 


3435 


ATSSOS LGRADPPRGGTMERS PGEG PS PSPMDC-PSAPSDPTDQP 
PAAHAK PD PGSGGQPAGPGAAGEALAVLTS FGRRLLVL1PVYLA 
GAVGIiSVGFVLFGLAljYlX5WRRVRDEKERSIjRAAJR0LLDDEEQL 
TAKTLY MS HREIi PAW VS FPDVE KAE KLNKI VAQVWP FLGQYMEK 
LLAE TVAPAVRGSNPHLQTFTFTR VELGEKPLR I IGVKWPGQR 
KEQILLDLNISYVGDVQIDVEVKKYFCKAGVXGMQLHGV1.RVIL 
EPJblGDLPFVGAVSMFFIRRPTLDI fTWTGMTNLLDI PGLSSLSD 
TMIMDSI AAFLVLPNRLLVPLVPDLODVAQLRSPLPRGI IRIttI* 
lAARGLSSKDKY^KGLIF^KSDPYAL\T?I^TOTFCSRVIDEELN 
PCWGETY EVWVHEVPGQBIEVE VFDKDPDKDDFLGRM KLDVGKV 
IX?ASVLDDWFPI>0GGQG0VHLRLEWL5LLSDAEKLEQVLQWNWG 
VSSRPDP PSAAI LVVYTJDRAODLPKVTSEbYPPOLKKGNXEPNP 
MVQLS 1 QDVTOESKAVYSTNCPVWEEAFRFFLQDPOSOEbDVQV 
KDDSRALTLGALTLP LARLLTAPELI LDQWFQLSS SG PNSRLYM 
KLVMR I LYLDSS E I CF PTVPGCPGAWDVDSEN PQRGS S VDAPPR 
PCHTTPDSQFGTEHVX.RIHVLEAODLIAKDRFLGGLVKGKSPPY 
VKLKLAGRS FRSHVVREDLNPRWNEVFEVIVTSVPGQELEVEVF 
DI05lJDKDDFl^RCKVlUiTT\niNSGFLDEWLTLEI)VPSGRLHI^L 









KF.LTPRPTAA5LEEVL0VKSL1OTQKSAELAAAU S1YMERAET) 
LP 1- "R KGT KH LS P Y AT LTVGL £ £ H K T KT 1 S QTS AP V'v.'D E S AS F L I 

| rkphteslelqvrgegtgvl6s lslplsellvatx lcldrwftl 
s?.cqgqvl,lraqlgi lvsqhsgveah5hsyshs£?.£ ls ee pels 
gg p phi tss apsv\rqrlthvds pleafagplgqvk ltlw yys e 
er klvs i vhgcrslronc^dppdpy vsllllpdkjorgtkbrtso 
k k rtlspefne3 ffwelpldeaqrr kldvs vksns £ fms rerel 
lgkvqldiaetdlsogvara'ydlmdnkdkgss 


! 


3 


IK 39 


ssrserleacaiappwlvsskfakpaolorpgkmvedgaeeled 
lvhfsvselpsrgygvmeeirrogklcdvtlkigdhkfsahriv 
laasipyfhawfrndmmeckcpeivmogmdpsai,e;allnfayng 

NLA I VQQWQSlsljMGKSrLOLQSl JCDACCTFLREPJ.KPKNCLGV 
R0KAErMMCAVl.YDAANS?lH0HFVEVSMSEEFl*7-.!,PLEDVLEL 

vsrdelnvkseeovfeaalav^vrydreorgtfl\kklqsnirll 
fcrpqflsdrvqoddlvrcchkcrdlvdeakdylln.ferrphlp 
afrtrprcctsiagliyavgglnsagdslnwevfdfiancwer i 
crpmttarsrvgvawngllyai ggydgqlrlstvcayntetdt ! 
wtevgsmnskrsamgtwldgqi yvcggydgnsslpsvetyspe i 

TDKWTVVTSMSSNRSAA\GVTVFEGRIYVSGGHDGliCIFSSVEM ; 
YK HH TATWHPAAGMIiNKKCR HGAAS LGSKMFVCGG Y [X?SG FLS I 
Afc'N YSS V \ ADQWCLI VPM\ HTRR \ SRVSLGGPAVG R L YAW?G VT 
TGOSNLXSSVGDVLTPETDCa'TFMNAPMACKEGGVuVGCIPLLT 

I 


5429

i 

t 


828 




RP h DALS CEGCLW PSES TVSGNG 1 PEPQVYAPPRP: URLAVPPF 
ACRERFHRFQP7YPYL0HEIDLPPTlSbSDGE£PFPYQGPCTL0 
LHEP EQQLELN RESVRAP PHKT 1 FDSDL>MDSARLC~G PCF P SSM S 
61 £ATCYGSGGRMEGPPP\TY SEV 1GHY PoSSFQHOOSSG PPSL 
LEGTRLHHTH I APLES AA1 W£ KEKDKQKGHPL 


5430

1 

i 




1507 


okrkkrrrkklnktiqpkajimsiswaiftclaalclfogvpvrs 
gdatfpkamdnvtvrqgesatlrct3 dnrvtrvaw1 mrst1 lya 
gnvk wcldpr wllsntqtq y s1ej qtfvdvydeg p y 7'csvqtdn 
hpktsrvhlivqvspk1ve1ssd1s1negnnisltciatgrpep 
tvtv;rhispkavgpvsedbyleiqgitreosgdyecsasndv\a 
apv\vrrvkvtvnvppy2seakgtgvpvgqkgtloclasavpsa 
efowy kddxrl 1 /egkkgvk v enrpfls kl1 ffnvs zhdygn yt 
cvasnklghtnasimlfgpgavsevsngtsrragcvwllpllvl 

KLLLKF 


5431


2 


1312 


AAJ^.PGSRRRRPLPDRPHI^GYEAPPPPAPRSPAWWiRSKPV\ 
LPGIT1NP\TIAEGPSP\TSEGASEANLVDL0KKL£ELELDEC?Q 
KKRIEAFLTOKAKVGELKTDDFER I SELGAGNGG W 1 KVQHRPS 
GLl MARKL3 HLEIKPA1RK01I RELOVLHECNSPYJ VGFYGAFY 
SDGEI SI CMEHtf DGGSLDCVLKEA KR I PEEI LGKVS : AVLRGLA 
YL R EK«0 1 fWRJDVKPSN I LVNS RG EI KLCDFGVSGOL I DSMANS 
F VGTRS YMAPERLOGTH YSVQS DI WSKGLSLVELA VC-R YPIPPP 
DAXELEA 1 FGRP WDGE EGE PHSIS PRPRPPGRPVSGKGMDSRP 
AMAI FELLDYI VNEP PPKLPHGVr TPDFQEFVNKCLI KNPAERA 
DLxyiLTNKTFI XRSEVEEVDFAGWLCKTLRLNQPGT FTRTAV 


5432

! 


2 


1312 


AAA^PGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPV\ 
LPG3TIKP\TIAEGPSP\TSFGASEANLVDLQKKLEELELDEQQ 
KKR I E AFLTQKAKVGELKDCD FER I S ELG AGNGGW'TKVQHRPS 
GL I MAR KL I HLG I KPAI RNQ I J R E LQVLHECNS PY 3 YG FYGAFY I 
SDGEI S ICMEHMDGGSLDOVLKEAKRI PEE I LGKVS 1 AVLRGLA 
YLREKHQIMHRDVKPSNI LVNSRGEI KLCDFGVSGCLI DSMANS 
FVGTKSYWAPERLQGTHYSVOSDIWSMGLSLVELAvr.RypiPPP 
DAKELEAI FGRPWDGEEGEFHSISPRPRPPGRPVSCHGMDSRP 
AMAI FELLDYI VNEPPPXLPNGVFTPDFOEFVNKCL : KNPAERA 
DLKK LTNKTF1 XRSE VEEVDFAG WLCKTLRLNQPGTF 7R TAV 


5433 

1 


360 


1865 


S VQE D KVG F ED P LHLCS WRARACP CT W PHC/ CTGLLfc C LG FAGV 
LFGW PSLVFVF JOJED YF KDLCG P DAG P I GNATGCADC KAODERF . 
SLIFTLGSFMHNFMTFPTGY1 FT>P.FXTTVARL3AIFFYTTATX*I | 






I AFTSAGSAVLXiFLAMPMLTl GG I LFL I TNLQ J GNLFGQHRST1 
ZTLyNGAFDSSSAVFLI IKLLYEKGlSLR/VLLHLHLCLQYIjAC 
STHFPPDAPGAHF1 pfAFQL-OLWPVf-WEWIfHKGK EG/00 LSMKT 
GSYSQPSSF0RRKKP0GOGKSRNSAPSGATL/CSRRFAWHLVWL 
£ V I QLWHYL?! GTLNSLLTNMAGGDKARVSTYTNAF AFTQFGVL 1 
CA PWNCLLMDR LKQ KYQKEAR K7GSS TLA VALCSTVPS LALTSL 
LC^GFALCASVPILPLQYLTFILOVjSRSFLYGSNAAFLTLAFP 
SEHFGKLFGLVMALSAVVSl.LQFP 1 FTLI KGSLONDP FY VNVT4F 
WLAIliLTFFHPFLVYRECRTKKESPSAJA 


5434


66 




RYAALIISLI0HKLLWRW0HCSRCV1MSPAQSAGLNWLF/GSGK 1 
HG P r 1 JGCSQYPACDYVR PLKSSADGHI VKX'LEGQ VCPACGANLV 
LROGRFCMFIGCINYPECEHTELJDXPDETAITCPOCK'-TGHLVO 
RRSRYGKTFHSCDRYPEC0FA3NFKPIAGECPECHYPLL1EKKT 
AOGVKHFCASKQCGKPVSAf 


5435 


4704 


159"/ 


PGDSSORLAEMSNAKERKHAKKMRrgOPTNVTLSSGFVADRGVKH 
HSGGEKPFOAOKOEPHPGTSRQROTRVNPHSLPDPEVNEQSSSK 
GMFRK KGGWKAGPEGTSQEI PKY 1 TASTFAQARAAEI SAMLKAV 
T0KSSNSLVF0TLPRKMRRRAT<SHNVKKLPRRLOElAOKEAEKA 
VHQKKEHS KNKCHKARRCKMNRTLEFNRRQKKN I WLETH I WHAK 
RFHMVKKWGYCLGEKPTVKSHRACYRAMTNRCLLODLSYYCCLE 
LKGKE EE I LKALSGMCN I DTGLTFAAVHCLSGKROGSLVLYRVN 
KYPREMLGPVTKlWKSORTPGDPSESRQLWIWLHPTbKODILEE 
3 XAACQCVEPI KSAVCI ADPLPTPSQEKSOTELPDEK IGKKRKR 
KDDGEWAKPIKKIIGDGTR^PCLPYSNISPTTGI I ISDLTMEMN 
RFRLI GPLSHSI LTEA1 KAAS\7HT\^GEDTEETPHRWWIETCKXP 
DSVSLHCRQEAI FELLGGI TSPAEI PAGTILGLTVGDPRINLPQ 
KKSKALP^3PEKC0DNEKVROhLLEGVPVECTl?SFlW^JQUl CKSV 

TGEDRU3WGSGWDVLLPKG>JGMAFW:P7IYRGVRVGGLKESAVH 
SQ Y KR S PNV PGD FPDCPAGMLFAEE0AKN1.LEKY KRR P PAKRPN 
YVKLGTLAPFCCPVJEOLT0DWESRV0AYEEPSVASSPNGKESDI, 
RRSEVPCAPMPKKTH0PSDEVGTS1EHPREAEEVJ4DAGC0ESAG 
PERI TDQEAS EN H V AATGSH LCVLR SR KLLKQLSAW CGP S SEDS 
RGGRRAPGRG0QGLTREAC1.S 1 LGKFPRALVWVSLSLL.SKGSPE 
PKTMI CVPAKEDFLQLHEDWHYCG POES KHSDP FRS KI LKQKEK 
KKREKJ*QKP\CRASSDGPAGEEPVAGOEAimX?LWSGPLPRVTL 
HCSRTI^LGFVTOGDFSMAVGCGEALGFVSLTGLLDMLSSOPAAQ 
RGLVLLRPPASLQYRFARIAIEV 


5436 


17 81 


63T 


ASDS1PWSEARTTRK1AQRGC0WSLPERMPLWFCGLPYSGKSR 
RAEELRVAl^AAEGRAVYVVDDAAVLGAEDPAVYGDSAREKALRG 
ALRAS VERRLS RKDWI LDS 1 .MY I KG FR Y ELY\CLARAARTPLC 
LVYCVRPGGPIAGPQVAGANENPGRNVSVSWRPRAEEDGRAQAA 
GSSVLRELHTADSVVNGSAOADVPKEI.EREESGAAESPALVTPD 
SEKSAKRGSGAFYSPELLEALTLRFEAPDSRNRWDRPLFTLVGL 
EEPLPUAGIRSAbFENRAPPPHOSTCSOPLASGSFLHOLCOVTS 
0VLAGLMEAOKSAVPGDLLTLPGTTEHLRFTRPLT74AELSRLRR 
QF1 SYTKMHPNNENItPQLANMFLOyi-SOSLH 


5437 


739 


1672 


COF^ASEFGGPLHTPAiVJFLRRLGGWLPRPWGRRKPMRPDPPYPE 
PRRVDSSSEN$GSDWDSAPETMEDVGHPKTKX>SGALRVSRAASE 

pskeepoveqlgskrmdslkwd0pisst0esgrleaggaspklr 
hdhvdsggtrrpgvspegglvgvpgpgaplekpgrrekllgwlr 
gepgapsryrx-gpeeclx?istnltlhll£lijvsal>lalcsrplr 

AA1X)TI>GLRGFLGLWLHGLLSF1Ju\U-:GLHAVLSLLTAHPLHFA 
CLFCLLQALVLAVSLREPNGDEAArDKESEGLEREGEEORGDPG 
KGL 


5438 

i 


2443 


1152


TKPRKRRHOPASORORPWSSDSTGDbLARGKGRKEENKGSDRVS 
IAPPSLRRPWC0SFAJ?CX5PELRAAKV?LKFPQliALRRJ^LG0LSC 
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLbSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE 
DLHHYKKLSEFFRRKLKPOARPVCGLHSVISPSDGRJLNFGQVK 
NCEVEQVKGVTYS LES FLGPRMCTEDLPFPPAASCES FKNOLVT 








PEUNiLYKCVlVI^.PGDYHrFHSPTDWTVSHF.RlIFPGSLMSVNP^ 
GNARWTK£LPOr< EN WLTGDWKHGFFSLTAVGATVWC-S I R I Y 
FDKOLKTNSPRHF K0STODFSFVTHTNREGVPMAL-R3EKLG/OS 
FNLGST1 VL1 FLA. PKDFNF0LKTGQK3RFGKALGS1. 


5439


2443 

■ 


1152 


TKPKKRRHOPASCSC/xPW.csDSTGDLl.ARGKGRKEENKGSDRVS 
LA P P S LRR PMM CC S E/vROG PFLRAAKWLH FPCLALR R R LGOLSC 
MSKPALKXRSWPl,TVl.y71^PFGALRPLSRVGWRPVSRVALYKS 
VrTRLLSRAWGK^NCVELPHWLRRPVYSLYJV.'TFGVN-MKEAAVE 
DLHH YiLNLSEFp > HKLKPQARPVCGLHSVJSPSDGR3IiNPG0VK 
NCEVEQVKGVTY5 LES FLGPRMCTEDLPFPFAASCDSFKNQLVT 
REGNELYJiCVJ Y' .AJ J GDY)1CFHSPTDWTV.9KRRHFPGSLMSVNP 
GKAR W l KELF'CHN T R WLTG D WKHG F FS LTAVG AT \ N WGS I R I Y 
FCRDI.KTNSPRH5r'XGSYNOFSFVTHTNREGVFK7aRGP.HLG/OS 
FKLG S T I VLI FEA P *0 FNFQL KTGQK 1 R FGEALGSl , 


5440

| 

i 






EFlPVTPDHRLV*j"KTK3V\0TFSPVNS\GOFFNYEMLkEEOEVA 
MLGAPHNPAPPrrTVJKJRSETSVPDHVVWSLFNTLFMNTCCLG 
FlAFAYSVKSRrf. XKVCDVTGAQAYASTAKCINI WAL.ILG3 FMT 
ILL1 JIPVLWOAOS 


1 

1 

i 
1 

i 

1 


2 


2 OS.. 


CRDG G KNG KMX'S I 74KPLE 3 K TQCSGPRMDPK J CPAD PA FFS F J N 
NStl,v;VANIETG: Ff<IU.TrCHOGI>SNVLDDPKSAGVATFVIOEE 
FDR FTC Y WWCPTArviEGS EGLKTLR1 LYEEVLTSEVEV 1 HVPSP 
ALEE P X TDSY R y \ F TG S KN P K I ALKLAE FCTPSQGK I VS TCE KE 
LVCPFS5LFPKV;-. Y I ARAG V? T R DG K YAW AM FLDR PQQ W LQLV 1 »1 » 
P PAL F . : P S TEN F ZC \ R LAS ARA V PRNVQP Y W Y E E VTNV W I NVH 
D J FY PFPOSEG KL E LCrXRANECKTGFCHLYK VTA VLKSQGYDW 
SEPFSPGEGEOSI.^NAJWVNEETKLVYFOGTKDTPLEHHLYVVS 
YEWiGElVRLTTPGFJTHSCSKSONFDMFVSHYSSVSTPPCVKVY 
KLSGPDDDPLHKOPRFWASMKEAAKIFHFHTRSDVRLYGMIYKP 1 
HALCPGKKHPTVLFVYCGP0V0LVNNSFKG1 KYLRLNT1ASLGY 1 
AVW 3 DGRGSCO)' G LK FEGALKNQMGCVEI E DQ VEG LQ FVAE K Y j 
GF 1 I>1 .SR VAlHGW S YGGFLS 1»MGL3 HKPQV FKVAI AGAP VTVWM | 

aydtgvterymdvpehtnohgyeagsvalhveklpnepnrllilh 
gflde^ivuffhtkflvsoliragkpyclovalppvepoiypner 
hsircpesgehyevt;.lkfloeyl 


5442



1 


3474 


CCQRSRRRSPDW FAK?AAKKAPKGKDAPKGA?KEAPPKEAPA£ 
APKEAPPEDOSF;;.IEPTGVFLKKPDSVSVETGKDAV\'VA:<VI'JG 
KELPDKFTIKWFKGKVJLEt^SKSGARFSFKESRNSASNVYTVEL 

h i gkw'lgdrg y y levkakdtcdscgfni dveaprqdasgqsl 
esfkrts ekksd7 /-.g eudfsgllkkrevveeek k.k k kkddddlg 
1ppeikellkgakkseyek1afqygitdlrgmlkrlkkaxvevk 
ksaaf-jkklidpaycadrgnklklmvsisdpdltlicwpkngoeik 
psskyvfenvgkkf. 3 ltinxctiaddaayevavkdekcftelfv 
keppvxjvtplfl'c'qvfvgdrvemavevseegaovmwmkdgvel : 
tredsfkaryr fkkdgkrhil1 fsdwqedrgr yqv3 tnggqce j 
A£liveekqlevicd:ax>ltvkaseqavfkcevsdekvtgkwyk \ 
ngvevrps kri ti shvgrfhklvjddvrpedegdytfvpdgyal j 
gs lsaklnfle i kv e y v pkq\eppki plgfasgg ktsenad/ 1 v 
wagnklr ldv\s:tgeapspfat\wlkg\devftttegrtrie 
kr vdcssfv1esacredegryti kvtnpigedvas i flo wd vp 
dppeavr1 tsvgeewailvweppmydggkpvtgylverkkkgso 
rwmklnfevfts?:'yestk>11eg3lyemrvfa\ r naigvs9psmn t 
tkpfmpia ptsz p^hl 3 vedvtdttttlkkr ?pnr 3 gagg 3 dgy 

LVE YC LEG S EE W VF -ANTE P VERCGFTVKMLPTG AR T LFR WGVN 
3AGRSKFATLAQP\T2REIAEPPKIRLPRHLROTYIRKVGEOLN 
LV\' P F QC : K PR PQV v r i\ T KGGAP LETS R VH VRTS D FDTV F F VR QAA 
RSDSG E YELS VQ I Z NMKDTAT3 R J RWEKAGP P 1 N VMVXE VWG T 
NALVE WOAPKDDGN 1 SE 3 MG Y FVQKADKKTMJSWFNV YERNRHTS C 
TV£DL3VGNEYYFR\A'TEN1CGLSDSPGVSKNTAR3L»KTG3TFK 
PPEVKEHDFRMAPKPLTPLIDRWVAGYSAALNCAVRGHPKPKV 
VWMKNK^E 3 REDFK Fl 1 TWYOG VLTLNIRR PS F FDAG TYTCRA V 
NEZXPEALAECKLEvTRVpQ 
WO 01/53312 PCT/US00/34263



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5443 


66 1003 



KKGOLDAG0S S EOKG G NRQPEQS K S R St' S SSSS PF = SRSAA t PA 
MAL.SMFLNGl>KtEDKEPLI£LFVKAGSDGESIGNCPFSOf<LF>jI 
IWLKGWFSVTTVDLKRKPABLQMLAPGTHPPFITrNSKVKTDV 
NKJ EEFLEEVLCPPKYLXLSPKHFESNTAGMDIFAXFSAYlKNS 
nPeANEALERGIjL.KTLOKLDFyLKSPbPDElDEN'i'KEDIKFSTR 
K FLDGNE MTLADCNLLPKLKI V KWAKK Y RN FDI F K F MTG I K R Y 
LTNAYSH DE FTNTCPSDK EVE I \ AY S DVAKRLHQV r.S R LLK h V S 
FMSSP 


5444


2 j 344 



fGPIGlTTGAOI'iAKWLRDYLSFGGRRPPPOPPTPDYrESDiLk/vY 
R>\OKNLEFEDPY*DSESRLEPDPAG?GDSKNPGDAKyGS?K}:RL 
: KV EAALKARAKALl/JG PGEELE ADTE Y LD PFDAC FHPAFP DDG 
YMEPYDAOWVMSELPGRGVOLYDTPYEEODPETADC-PPSGOKPR 
CSR MPQEDER PADEY DQPWEWKK DH 1 S KAFAVQFE I FEWER*} PG 
SAKELRRFPPRSPQPAERVDPALFLEKCFWFHGPU^HADAF.SLl. 
5LCKEGSYLVRL5FTNP0DCSLSLRSSOGFLHLK1-A.RTRENOVV 
LGOHSGPFP£VPEI_VLHYSSRPLPVOCAEHLALIiYF\rVTOTF*Q 
+ PDWGDRRPNGQVATGLPELWGAEAPS/LAAJ1PGLRVERHPEGLP 
KAE K PGL RG P LLGLR E PLGAG PRC P WG LQ F PRR CC V V WF SOAP AH 
OGGGCGYGQSOGPSGRPRGGAGSRH 


5445 





486 


J L SRGFLGSVE I CI ObPLPASEPVLLLTWARRRWKLTRSR REFT 
TLRAOSVCPWWI* ETRMTTOS3PVEVDFSEPYPS0L1 KPJFEYSP 
EEESEPPAPN1RNMAPNSLSAPTM1.HKSSGDFS0A>?STLKI>ANH 
CRPVSROVTCLRTOVLEDSKDSFCRRHPGLGKAFP£GCSAV£EP 
AfESWGALPAE'HQFSFMEKRNOWLVSOLSAASPDTGHDSDK£D 
CSLPNASAJ>SLGGS0EMVQRPQPKRNRAGLDLPT1P.TGYPSGPQ 
EVLG 1RQLER P1.PL7S VCY PQDLPRPLRSREF POFEPOR Y P AC A 
C^bPPNI>SPKAPWNYHYHC?GSPDHOVPYGHDY?RAAyOQV10P 
ALPGQPLFGASVRGLKPVOXVILNYPSPWDOEERFAQRDCSFFG 
LF RKODOFHHQPPNRAGAPGESLECPAELRPQVPgi- PS PAAV FR 
rPSNPPARGTLK-rSNLPEELRKVFITYSKDTAMEWKFVNrLLV 
NGFOTA I DI FEDR I RG I D I] KWMER YLRDKTVMI I VA ISPKYKQ 
CVEGAESCLDEDEHGLHTKYIHRHMOIEFIKOGSMia RFIPVLF 
PNAKKEHVPTWLONTKVYSWPKKKKNILLRLLREEFYVAPPRCP 
LFTLQWPL 


5446 

1 
i 
1 


972 


161 


S5WSWCT^RMRKrRLWGLLWKL,FVSELRAATKLTECKYELKEGO 
TLDVKCDYTl.EKFASSQKAWQI IRDGEMPKTLACTEK PSKNSHP 
Vf'VGRI I If-DYHDrlGIjliRVRMVNLOVEDSGLYQCVI YQPPKEPIti 
VjI FDRI R LWTKG FSGTPGSNENSTQNVYK 3 P PTTi KALCF LYT 
7FRTVTOAPPKSTADVSTPDSE1KLTNVTD12RVPVFN1V1LLA 
GO FL SK S LVF S VL FAVTLR S FVP ♦ Ail E FTRMS SDFC F HPSG 3 CA 
KC-GGRR 


5447


207 


617 


^TTAJRTLS LMAS LVAYDDSDS EAETEHAGS FNATGOOKDTSGVAR 
PPGODFASGrLDVPKAGAOPTKHGSCEDrcGYRLPLAQLGRSDR 
GSCPSQR LOW PG K E PQ VTF P 1 KEPS CS £ LWTS HV PAS HMPLAAA 
RFXOVKLSRNFPKSSFKAQSESETVGKNGSSFQKKKCEDCWPY 
TF K R ZiRQ ROAJL S TETGKG KDVEPCG P PAG RAP A P L Y VG PGVS E F 
lGPYLNSHYKE7TVPRKVLFHliRGHRGF\W3QWCFVLSKSK«L 
I^7SMDKTFKVW!^AVDSGHCL0TYSLHTEAVRAARV;APCGRR I L 
SGGFDFAXHLTDLETGTOLFSGRSDFR1 TTXKFHPXDHM3 FLCG 
GFS SEMK-'XWDI RTGKVWRS Y KAT1QGTLDI LFLR EGS E FLS STD 
ASTRDSADRTl J AWDFRTSAK1SNQ1FHERFTCPS1ALKPREPV 

r w i in 2 ixrvtjr o j v 3 a I » rvlx rv xv i tunrv cu i vkjv,&^^ r\9 

GDLLVTG S AIX5 RVLMYSPRTASRACTLOGKTQ ACVGTTYH P V LP 
SV7-ATCS WGGDMKI WH*AFHWLSLGEA3 GDLA PARG V SG PGR S L 
K5FS PSXSbLVLLCGRAMFCPATCPWQLPALS K 


5448 



194 


1833 



KASKVTDAIV^YOKKIGAYDQQIWEKSVEOREIKGLRNKPKKTA 
HVKPDLI DVDLVRGSAFAXAKPESPWTSLTTKG1VRVVFFPFFF 
R VCWLQVT S KV I FFW LLVLYLLQVAAI VL FCSTSS PHS I PLTEVI 
GP j WLMLLLGTVRCOI VSTRTPKPPLSTGGKRRR KLS KAAHLEV 
K REG DGS S TTDNTQ EG AVQNHG TSTS K S VGTV FR DL rlAAF F LS 
GSKKAKNSIDKSTETDNGYVSDDGKKTVKSGEDGICN'HFPOCET 








IRPEETA^WGTLRNGPSKDTORTIT^SDEVSS'clGPETGVSITI 
RRHVTRTSEGVLRNRKSHHYKKHYPNEDAPKSCTf CSSRCSSSR 
ODSE£A-RPESETEDVLKEDLLHCAECHSSCTSETrVENHQ]N T PC 
VKKEyKDDPFHOSHL^LHSSHPGLEKISAIWIEGNDCKKADy^S 
VLE ISGH1MNRVNSH1 PG1GY02 FGNAVSLJ LGL7PFVFRLSQA 
TDLEOl.TAHSASELYVlAFGShTEDVIVLSKVI 3 S7VVRVSLVWI 
FFFLLC V AERTY KQVGI M* TSEGVLRNRKSHHY K K HY f tfEDAPK 
SGTSCrSRCSSSRODSESARPESETEDVLWEDLLKCAECHSSCT 
SETDVENHOINPCVKKEYRDDPFHOSHbPWLHSSJIPGLEKISAI 
VWEGMECKKADMSVLE1 SGM I MNR VNSH I PG1 GYC 1 FGMAV S LI 
LGLTFr VFRLS0ATDLEQLTAHSASELYV1AFGSKEDVJVLSMV 
I ISFW R VSLVW1 FFFLLCVAERTYKQVGIM 


5449 


194 


1833 


MASKVTDA1VWYOKK1GAYD001WEKSVEQRE3KGURNKPXKTA 
HVKPD1IDVDLVRGSAFAKAXPEEPVJTSLTTKG3VRWFFPFFF 
RWWLOVrSKVIFFWLLVLYLLOVAAIVLFCSTSSPHSIPLTEVI 
GP1 WLKI XLGTVHC01 VSTRTPKP PLSTGGKRRRK LRKAAHLEV 
HREGIX3SSTTD>nX>EGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAK.NSIDKSTETDNGYVSLDGKKTVJCSGEDGJ ONHEPQOET 
IRPEET/iWNTGTLKNGPSKDTQRTlTNVSDEVSSEEGPETGYSL 
RRHVDK TSEGVLRNRKSHH YKXHY PNEDAPKSGT5: CSSRCSSSR 
gDSEKARPESETEDVLWEDLLHCAECHSSCTSETrVENHQlKPC 
VKKEYF.DDPFK0SKLPKLHSSHPGLEKISA1WEGNJDCKKADMS 
VLEI SGM J IPG J GYOIFGNAVSLILGLTrf-'VKRLSOA 
TDLEQLTAH£ASELYVIAFC£NEDV3Vl,SMVIlSF\'VRV$lvVWI 
FFFLLCVAERTYKOVGIf:*T$EGVLRNRKSHHYKKHYPNEDA?K 
SGTSC5SRCSSSR0DSESARPESETEDVLWEDLbHCAECUS£Cr 
SETDVENHOINPCVKKEYRDDPFHCSHLPWLHSSHPGluEKISAl 
VWEGNPCKKAXMS VL.E1 SGMlMNRVNSH I PG1 G YO UGNAVSL 1 
LGLTPFVFRLSQATDLEOLTAHSASELYVlAFGSIvrDVIVLSKV 
IISFWKVSLVWIFFFIXCVAERTYXOVGIK 


5450


813* 


1242 


G00FAS F FG * NHP EVT VAMALTDl DLQLOFSMSOr- E A1»1,LLAAG 
PADHLLbQLYSGHLQVRLVljGOEELRLOTPAETLLr.DS I PH7W 
LTVVEG^ATLSVDGFLNAfSAVPGAFLEVPYGLFVGGTGTLGliP 
YLRG7SRPbRGCI>T4AATLNGRSl,LRPl>TPDVHEGCAEEFSASDD 
VAbGFSGPHSLAAFPAWGTQDEGTLEFTLTTOSFOAPlAFOAGG 
RRGDFI Y VP I FEGHLRAWEKGQGTVLLHNSVPVAbGQPHEVSV 
KINAHRLEISVDOYPTHTSNRGVLSYLEPRGSLLZ^GLDAEASR 
HL0EHRU5LTPEATNASLLGCMEDLSVNG0RRGLREALLTRWKA 
AGCRLEEEEYEDDAYGKYEAFSTIAPEAWPAMELFEPCVPEPGL 
PPVFAN FTQLLT1 SPLWAEGGTAWLEWRH VQPTLE ^EAELR K 
5QVliFSVTKGAHYGELEIJ)ILGAOARKMFTLLDVV7^KKARFIHD 
GSEDTSDOLVIjEVSVTARVPMPSCLRRGQTYLLPICVNPVNDP? 
HI I FPKGSbMVI LEHTQKPLGP EV FOAY DPDSACEG LTFOVLGT 
SSG L»P VER R DQPG EPATEFS OREL EhGSLVYVHCGG PAQDL TFR 
VSDGLQASPPATLKWA JRPA1 01 HRSTGLRIACG^AMP JLPAN 
LSVETNAVG<)DVSVLFR\rrGAliQFGELOKHSTGGVEGAEWWATQ 
AFHQRDVEQGRVRYLSTDFOHHAYDTVENliALEVOVGOEJLSNL, 
SFPVTlGRATWMiRLEPLHTONTQOETLTTAHLEATLEEAGPS 
PPTFHYEVVOAPRKGm^LOGTRLSDGQGFTODDlCAGRVTYGA 
TARASFJ^VEDTFRFRVTAPPYFS PLYTFPI H I GGDPDAPVXTNV 
LLWPEGGEGVLSADHLFVKSLN^ASYLYSVMERPKLGRLAWRG 
TODKTTMVTS FTKE PLLRGRLVY QHDDSETTEDD3 P FVATRQGE 
SSGDMAWEE VRGVFR VAIQPVNDHAPVQTI $rt FHVARGGRR hh 
TTDDVAFS DADSGFADAQLVLTR KDLLFGS I VAVDE FTR PI Y R F 
TQEPLR KR R VLFVHSG APRGW 1 QLOVSDGQHOATALLE VQAS EP 
YuRVAJ^GSSbWPOGGQGTIDTAVLrO.DTKLDIRSGDEVHYKVT 
AGPRWGOL VRAGOPATA f SQQDLLDGAVLYSHNGS LS P 5DTMAF 
S VEAGPVHTDATLQVT I ALEGPLA PLKLVRHKK 1 YV FOGEAAE I 
RSDQLEAAQEAVPPADI VFSVKS P PS AG YLVM VS RC ALADEP PS 
LDPVOSFSQEAVDTGRVLYLKSRPEAWSDAFSLDVASGLGAPLE 
GVLVELEVLPAAIPLEAONFSVPEGGSLTLAPPLLRVfGPYFPT 








LLGLSLOVLEPP0MGPLQKSDGP0ARTLSAFSWRMVEEQL1RYV 
HDGSETLTPS FVLMANASEMDROSHPVAFTVTVLPVNDQP? : LT 
TNTGLOMWF-GATAP JPAEALRETDGDSGSEDLVYTIEQPSNGRV 
\O.RGA?GTEVRSFTQAQLDGGLVLFSHRGTLDGGFPFRLSDGEK 
TSPGHFFRVTAQKQVLLSLKGSCTLTVCPGSVQPLSSQTLRASS 
SAGTDPQL1 .1,Y RWRGPQLGRXF KAQCDSTGEALVNFTQAEVYA 
GN I LY EHEM P PE P F WEAHDTLELQLS S P P ARD VAATLA V AV S FE 
AACPCRPSHLWKNKGLWVPEGORARITVAALDASNLLASVPSPQ 
RSEHDVLFOV^OrPSRGQLLVSEEPLHAGQPHFLOSOLAAGOLV 
Y AHGGGGTOODGFH FRAHLCK5 PAG AS VAG POTS E AFA1 T VRDVN 
SRPPOPQASVPLRLTRGSRAPISRAQLSVVDPnSAPGElEYEVC 
RA.PHNGFL5LVGGGLGPVTRFTOADVDSGRLAFVANGSSVAGIK 
0LSMSDGASPPLPMSLAVDILPSA3EVQLRAPLEVP0ALGRSSL 
S00OLRVVSDREEPEAAYRLIOGPOYG>JLLVGGRPTSAFSOFQI 
DQGK W FAFTN FS SSHDH FR VLALARG VNAS AWNVT VRALLHV 
KAGGPWPOGATLRLDPTVLDAGELANRTGSVPRFRLLEGPRHGR 
WRVPRARTEPGGSQLVEQFTOODLEtXSRLGLEVGRPEGRAPGP 
AGDSLTLELW/iOGVPPAVASLDFATEPYNAARPYSVALLSVPEA 
ARTE AG K PES S7PTGEPG PKASS FEPAVAKGG FLS FLEANMF S V 
IIPMCL^LLLALILPLLFYLRKRKKTGKKDVOVLTAKPRNGLA 
GDTETFRKVEPGOAIPI,TAV?GOGPPPGGCPDPELLOFCRTPNP 
ALKNGQYWV 






2274 


RDSSECGRTGDTLGRPSACMDA1.KPPCLWRNHERGKKDRDSCGH 
KNSEPGSPHSLEALRDAAPGOGLNPLLLFTKWLFIFNFLFSPLP 
TPALICI LTFGAAI FLWL1 TR PQ P VL P LLD LKN OS VG I EGG A.R K 
GVSOKNNDLTSCCFSDAKTMYEVFQRGLAVSDNGPCLGYRKPNQ 
PYRWLSYKOVSDRAEYLGSCLLHKGYKSSPDOFVGIFAONRPSW 
I ISElACYTYSMVAVPLtYDTliGPEAl VH IVNKADI AMVICDTPC 
KALVLIGNVEKGFTPSLKVX I LME-PFDDOLK^RGEKSGIEILSL 
YDAENIiGKEHFRKPVPPSPEDLS VI CFTSGTTGDPXGAMI 7HQNT 
I VSNAAAFLKC VEHAYEPTPDDVAI S YL PLAHM FER I VQAW YS 
CGARVGFFQGDIRLLADDMKTLKPTLFPAVPRLLNRIYDKVQNE 
AK7 PLKKFLLKLAVSSKFKELQKG J J RHDSFWDKLI FAK2 QDSL 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCOVYEAYGOrECTGGCT 
FTLPGDWTSGHVGVPLACNYVKLEDVADMNYFTVNNEGEVCI KG 
'JTA^FXGYLKDPEXTOEALDSDGWI^TGDIGRWLPNGTLKI JDRK 
KM FKLAQGEY 3 APEKI ENI YNR SQPVLQI FVHGE SLRSS LVGV 
WPDTDVLPSFAAKLGVKGSFEELCCWQWREA2LEDLOK3GKE 
SGLKTFEOVKA1FLHPEPPS1ENGLLTPTLKAKRGELSKYFRTO 
IDSLYEHIQD 


5452


1831- 


2138 


SRVPSLCLSLSLSLSPSREPVAGAPGCGTAGPPAMATLWGGLLR 
LGSLLSLSCLALSVLLIjAQLSDAAKN f EDVRCKC 1 cpp y ken sg 
KiyNKNISOKDCDCLHWEPMPVRGPDVEAYCLRCECKYEERSS 
VTIKVTI 1 1 YLS3LGLLLLYMVYLTLVEPILKRRLFGHAQL1 QS 
DDDIGDHQPFAJ^AHDVLARSRSRANVLNKVEYAQORHKLQVOEQ 
RKSVFDRHWLS 


5453




1520 


ps i paavpos appephreetvtatatsqvaooppaaaapgeoav 
agpapstvpsstskdrpvsopslvgskeeppparsgsgggsake 
p0eersqq0dd1 eeletkavgmsndgrflkfdi eigrgsfktvy 
kglbtettvevawcelodrkltkeerokfkzeaemlkglqhpni 
vrfydswestvkgkkcivlvtelmtsgtlktylkrfkvmkikvl 

RS NCR 01 I>KG LQ FLHTR TPP I IHR DLK CDN I F I TG PTGS VK 1 GD 

lglatlkras fa ks vi gtpefmapemy eeky des vdvyapgmcm 
lematseypysecqnaaqiyrrvtsgvkpasfdkvaipevkeii 
egc1r0nkx>erys2 kdllnhaffqeetgvrvelaeeddgek1 ai 
klhlir i ed i k klkgkykdnea1 e fs fdlernvpedvaoemve sg 
yvcegdhktmakai kdrvsl1 kr kreqrql * 


5454


in 


1520


PS I PAAVPQSAP PE PHRBETVTATATSQVAQOP P AAAAPGEQAV 
A6PAPSTVPSSTSKDRPVS0PSLVGSKEEPPPARSGSGGGSAKE 
P0EERS0OODD1 EELETKAVGMSNDGRFLKFDI EI GRGSFKTVY 
KGLOTETTVE VAWCEUODRKLT KS ERORFKEEAEMLKGLQHPNI 








VR F Y DSKES TV KG K K CI VLVTELMTSGTLXTY LXR FX VHX I K VL 
PSWCR01 LKGLQFLKTRTPPI 1 HRDLKCDNI Fl TGPTGSVKIGD 1 
LGLATLKRASFAKSVJGTPEFMAPEKyXEXYDBSVDVYAFGMCM i 
LEMATSEYPYSECONMQiyKRVTSGVKPASFDKVAlPEVKElI 1 
£GClRONKDERYSIKPLLNKAFFOEt'7TiVRVELAEEDlX5EKIAI , 
KLWLRlFDlKXLKGKYK^NEAIEFSFDLERJJVPEDVAQEMVESG 1 
yVCEGDHKTMAKAl KDRVSLJ KRKREQRQL* 


5455
5456


1359 


377 


LTMVS PATRXS LPK VKAMDF 1 TSTA I LPLLFGCLGVFGLFRLLQ 1 
KVRGKAYLRNAVW J TGATSC-LGKECAKVFYAAGAKLVLCGRJNG 
GALESLIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEIL 
QCFGWDI LVNNAGJ SYRGT1 MDTTVpVDXRVMETNYFGPVALT 
KALLpSK j KRRQGH2 VAISSIOGKMS J PFRSAYAASKHATQAFF 
DCLRASMEQYE I EVTV ISPGYI HTNLS VNAI TADGSRYGVMDTT | 
TAOGRSPVEVAQDVLAAVGKK K KDVI LADLLPSLAVYLRTLAPG 
LFFSLMASRAR KER KSKNf 


2 


2331 


"CGAGLVAAG AVLVL Y PAS RAGE RTR V ?3S PAPSSbPLHSPGACG 
TE VDMDPQR S PLLEV KGN I E LKR PL I KAF SQLPLSGSR LXR R PD 
QM E DGLEFE KKRTRG LGATT K 1 TTSH PRVPSLTTVPQTQGQTTA j 
GKVSKKTGPRCSTAIATGLKNQKPVPAVPVQKSGTSGVPPMAGG 1 
KKPSKRPAWDLKG0LCDLNAELKRCRERTQTLDOENQOLQDOLR 1 
DACKXJVKALGTERTTLEGHL.'VKVOAOAEQGQOELKNLRACVLEL ' 
EERLSTQEGLVOELCKK0VEI/CEERRGLMS0LEEKERRL0TSEA . 
jvt .CQCOAPVA^I .ROFTVAOAA7 .T.TFR FERT.HGLPMERftRI.HNfn . ! 
QELKGNIRVFCRVRPVLPGEPTFPPGLbLFPSGPGGPSDPPTRL 1 
SLSRSDERRGTLSGAFAPPTRKDFSFDRVFPPGSGQDEVFEHIA 
MLVOSALDGYPVCI TT\YGQTGSG K TFTMEGG PGGDPQLEGL I PR 
ALRHLFSVAQELSGOGWTYS FVASYVE I YNfiTVRDLLATGTR KG 
0GGECElRRAGPGSEELTVTNARYVPVSCEKEVt)ALLHLARO>IR 
AVARTAQNERSSRSHSVFQLCU S G EH S S RGLQCG APLS LVDLAG 
SERLDPGLALGPGERFJRLRETCA3NSSLSTLGLVIMALSNKESH ' 
VPYRNS KLTYLLONSLCGSAK.MLMFVNI SPLEENV3ESLNSLRF 
ASKVEPSVLFGTAOSKRKVIKTDPDLCVCVCVCVCVCVCVCVCVP 
MSMYRVRGGRVAGGCFIGWRAPCFRA 1 X 


5457 


2 


• 54 0 


DDFVERRRWTRTTCLVRSPPHVFVCGHACSWNGGSLDPLXGTPA 
LLRSAERLf^KVKKLRLDKENTGSWRSFSLNSEGAERMATTGTP j 
TADRGDAAATDDPAAR FQVQKHS WDGLR S I IHGSRXYSGLI VNK | 
APHDFQFVQKTDESGPHSHRLYYLGNPYGSRENSLVYSE1PXKV j 
RKEALLLLSWKQMLDHF0A7PHHGVYSK EEELLRER KRLGVFG2 
TSYDFHSESGLFLFQASN5LFHCRIX5GXNGFMVSPGPGCVSPMX 
PLEI XTOC5GPRMDPK J CPAD PAFFS F I NNSDL WVANI ETG EER 
RLTFCHQGLSNVLDDPKSAGVATFVIOEEFDRFTGYWWCPTASW 
EGSEGLXTLR I L Y EEVDES E V E V I H V PS PALEER XTDSYRYPRT 
GSKKPXI ALKLAEFQTDS0GX1 VST0EKELVQPFSSLFPKVEY1 
ARAGWTRDGKYAWAMFLDRPOOWLOLVLLPPALFIPSTENEEQA 
ASLCCS CPG-ECPAVCGVRGGHQRLDQCS 


5458 


6642 


4022 


?VF<?LREPOWEPAQPSATMSAPSE£EEYARLVMEAQPEWIjRA£V 
KRLSKEIAETTREXIOAAEYGI^VLEEKHOLXLOFEELEVDYEA 

irsemeolkeafgoahtnkxxvaadgesrseslioesasxeqyy 
vfxvlel>0telxolr>?vltnt0s enerlasvaqelxeinqnvel 
qrgrlrddikeykfrearllodyseleeenisloxovsvlrono 
vsfeglkhei krleeeteylhsqledai rlkei s erqleealet 
lktereqkns lrkel5 hyms i ndsfyt£hlhvsldgi»xfsddaa 
efwtoaealvngfehgglaxlpldnx'tstpxxeglappspslvs 

DU^ELNlSEigXLKOOLJ^r^REKAGLlATLQDTOKQLEHTRG 
SLSEOQEXVTRLTENLSALRRLOASXERGTALIWEKDRDSHEDG 
DYYEVDI NGPEILACXYHVAVAEAGELREQLXALRSTHEAREAO 
HAEEXGRYEAEGOALTEKVSLLEXASROPRELLARLEXELXXVS 
DVAGETC^SLSVAQDELVTFSEELANLYHHVCMCNWETPNRVML 
DYYREGQGGAGRTSPGGRTS PEARGRRS P I LLPKGLLAPEAGRA 
DGGTGDS S PS PCS SLP S PLSDPR R EPKN I YNLI Al I RDQ1 XHLQ 
AAVDRTTELSR0R3ASQELGPAVDXDKEALMEEILXLXSLLSTX 



5460






316 



1262 




R EQ 1 TTLRTVT.KAN KQTAEVA LAN LK S KY ENEKAMVTE TMMKLR
NELKALKEDAA7FS £ LRAMFA7R CDE Y I TQLDEMQR QLAAAEDE
KKTLNSlilaRKAlQQKLALTQRLELLELDHEQTRRGRAKAAPKTK
/ PATPSVSHT&.CkSPRAEGTGlJJlQVFCS EKHSIYCD

RGGHRLSGMASTTfKD j VKOGW RIRSRKLGIYQrTcWLVFKKASS
KGPKRLEKFSDERAAyFRCYHKVTELNKVKNVARLPKSTKKHAl
GiyFNDDTSKTFACESDLEADEWCKVl.OMECVGTRlNDISLGEP
DLlATGVERE02ERFNVYt^PSPWLGCYMGECALQlTYEYICLW
DVOK T PRVKLI£WPLSALRRYGRDTTWFTFEAGRMCETGEGLFJF
QTRDGEAIYOK^SAAUAIAEOHERLLOSVKNSMLOKKMSERAA
SLS^VPLPRSAYWOKITaOKSTGOLYRLQDVSSPLKLHRTETF
PAYRSEH



45 



2097 



5461 



1481 



160 



"T63" 



"33ST" 



5463 



23*? 



1011 



KPGCRAGELSTGSKARERVRJvRVSAPCC-ODSRRCDFEVL,RGRSP 
GLGLAtMPSCGACTCGAAAVRLlTSSIjASAQRGlSGGRIHMSVL 
GRLGTFETQ1L0RAPLRSFTETPAYFASKDGISKDGSGDGNKKS 
AS EG S S KKSG S GNS G K.GGNQLR C P K CGDLCTHV ETFV S STRPV K 
CEKCHKFFWLSEAPSKKSI I KEPESAAEAVXLAFOQXPPPPFX 
Kl YN Y LDKYWG OS FA X X VLS VA V YNH Y KR I YNNI PANJUR QQAE 
VE KQTS LTPRE LEI R RR EDEY R FT KLbQ 3 AG I S PHGKALGASMQ 
CQVNQQI PCEKRGGEVLDSSHDD I KLE KSNI LLLGPTGSGXTLL 
AQTLAKCLDVPFAI CDCTTLTQAG Y VG ED 1 ESVI AXLLQDANYN 
VEXAO0GI VFL.DEVCK 3 GSVPG3 KOLRDVGGEGVQQGLLKliLEG 
TI VNV PEKNSR KLRGETVOVDTTN I LFVASGAFKGLDR 3 I SRRK 
NE KYLG FGTPS K LG KG R RAAAAADLANR SG ESSTHQD 3 E EXDRL 
LRHVEARDLIEFGMIPEFVGRI.PVVVPLHSLDEKTLVOILTEPR 
NAV 1 PQV QALFS HDXCE WvTITDALXA3 AJRLAbERXTG ARGLRS 
IKEKJ.LLEPMFEVPNSDIVCVEVDKEWEGXKEPGY3RAPTKES 
SEEEYDSGV EEEGWPR QADAAKS 

"iNPPPPPK SPCG RAJ* K WRRErR PG APE AAVWELP SGPGPERLFD 
SHRLPGDCFLLLVLLLYAPVGFCLLVLRLFLGIHVFXVSCALPD 
SV3.RR F\An?TMCAVLGbVAROEDSGLRDHS VRVLI SNHVTPFDH 
>33VNLLTTCSTPLL.NSPPSr VCWSRGr MEMNGRGELVEELKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFSIQDWOPLTL 
QVQRPLVS VTVS DASWVS ELLWSLFVP FTVYQVRKLRF VHRQLG 
EANE E FALRVOOLVAKE LGQTGTRLTP ADKAEHMKRORH PRLRP 
OSAOSS FPPSPG ?S PDVOLATbAOR VKEVLPHVPIjGVI CRDLAK 
TGCVDLTITftLLEGAVAFMPEDlTXGTOSLPTASASXFPSSGPV 
TPCPTALTFAKy^WARCESLQER KQALYBYAKRRFTERRAQEAD 

"XI KERCMSANK SFP SAC-KSVLPTA3 PAVLPAASPCSSFXTGLSA 
R LSNG S FS APS L?N S RG S VHTV S FLLQ I GL»TRE S VT I EAQELS L 
SAVXDLVCSTVYQXFPECGFFGMYDXI JXFRKDMNSEN1 LQLI T 
S AI)E I HEGDLVEW bS AJLATVEDFO 1 R PHTLYVHSYKAPTFCDY 
CGEKLWGLVRXXjLKCEGCGLNYHKRCAFXIPNNCSGVRKRRLSN 
VS bPG PGL>S VPR PLQF E YVALPSEESHVHQEPS XR I PS WSGRP I 
WMEKM VMCR VKV PHTFAVHS YTRPT3 CQYCXRLLXGLFRCGMQC 

xbctcft^chxrcaskvfrdclgevtfngepsslgtdtdipkdidn 
kdinsdssrgldd?eepsppedkkffld?sdldverdeeavkti 
spstsnni plmr wos3 xktxr ks s t^/xegwmvhy tsr pnlr k 
rhywrldsxcltlfqnesgskyyxeiplseilrissprdftnis 
ogsnpbcfe ii tdtmv y fvgenngdsshnpvlaatgvgldvaqs 

WSKAIROALMPVTPOASVCTSPGCCXDHXDLSTSISVSNCQIQE 
NVD1S TVYQIFADE VhGSCQFG I VYGGXHRKTGRDVA I XVI DKM 
R ? P TXO E S QI>RK EVA I LQN LHH PG I VN LECMFETPER VFVVME K 
LHGDMLEMII^SEKSRLPERITKFWVTOILVALRNLHFKNIVHC 
DLXPEKVLiiAS AE? F POVKLCDFGFAR 1 1GEKSFRRSWGTPAY 
LAPEVLRS XGYKRSLDMWSVGVI I YVSLSGTFPFNEDEDINDQI 
QNAAF M YP PNPW RE I SGEAI Dhl ;JNLLQVKMRXR YS VDXSLSH P 
W LQDYQTWLDLR EPETR I GERY I TRES DDARWE2 HAYTHNLVY P 
KHFlWAPNPPDMFEDF 



Ll^^^TSRCSKLPEVLPDCTSSAAPVVXlVEDCGSLVKGQPQ 
YVMQVSAKIXy>LLSTVVRTLATQSPFNDRP«CRICHEGSSQEDL 




i 

i 
1 


I^PCECTGTLGTIHRSCLEHWLSSSNTSYCELCHKRFAVERKPR I 
PLVE^LRWPGPOHEKRTI.FGDMVCFLFITPMTISGWLCLRGAV 
DHLHFSSRLEAVGLlALTVALFTIYl.FWTbVSFRYHCRLYWEVm 
RTNQRVI L.LI PKSVIWPSNQPSLLGLHSVKRNSKETVV 




677 


SPSMNPRKKVDbKLI I VGA3GVGXTSLLHQYT/HXTFYEEYQTTL 
GASIkSKI 1 1 LGDTTliKLOl WDTGGQERVRSMVSTFY KGSDGC J 
LAFDVTDI iESPEAI jD 1 W RGDVLAK I V PMEQSY PMV LLGN X 1 DLA 

DRKYOSILENH ltesi klspdosrsrcc 


5465


52,1 


.rj4e 


"KGDPREF I RVH REALE CDYVSAHLHEK 3 DL3 FGYXQOGPAAVE A 1 
VNVFHKLFYEGQVD J YTJINDPLKETAT1GFINNFG0IPKCLFKK 
PHP^KRVRSRLNGDKAGISVLPGSTSDKIFFHHUJNLRPSLTPV 
KELKEPVGOI VCTDXG JIiAVEOWKVTjIPPTWNKTF/.WGYADLSC 
RLGTVESDKAMTVYECLS EWGQI LCAI CPNPXLV1TGGTSTWC 
WEMGTS K EKAKTVTLKQALLGHTDTVTCATAS l*AYH 1 1 VSGSR 
DRTCIIWDLWXI^SFiTOLRGHRAPVSALCINELTGDaVSCAGTy 
IHVWSINGNPI VS VMT FTGRS Q0 1 1 CCCMSEWEWDTQM VI VTG 
HSDG WR FVIR14EFL.CV P ETF A PE P AE VLEMQED CPE AQ 1 GQEAQ 
DEDSSDSEADEOSI S0DPKDTPSOPSSTSHRPRAAS CRATAAKC J 
TDSG5DDSRRWSD01.SLDEXDGFIFVNYSEGOTRAHLOGPLSHP 
HPNPI EVRNYSKLKPGYRWERQLVFRSKLTMHTAFDRKPNAHFA 
EVTALGISXDHSR1 lA'GDSRGRVFSWSVSDQPGRSAADHWVKDE | 
GGDSCSGCSVRF SLTERRHHCRNCGOLFCQKCSRFOSEI KRLK1 
SSPVRVCQNCYYNLOHERGSEDGPRNC | 


5466






HACAKASAJ 1ASG RL VR W WR KK RS VMG I C/TS PVLLAS LG VGL VI L 
LGLAVGSYLVRRSRRPOVTLLDPNEXYLLRLLDXTTVSHNTKRF ; 
R FALPTAHHTLG LPVG KH I YLSTR I DGS LV I R P Y TP VTS DEDQG | 
YVDLV 3 KVYLKG VH P KF P EGGKMSQ YLDS1>KVGDWE FRG PSGL . 
LTYTGKGHFNIOPNKKSPPEPRVAKKLGMIAGGTGITPMLQI.JR 
A3 LKVPEDPT0CFLLFAN0TEKD1 IL>REDLEELOARY PNRFXUJ I 
FTLDHPPKUHA YSKC5FVTA3M3 R EHLPAPGDDVL VLLCG P PPMV 
OLACHPNLDK1X5YSCKMRFTY i 


5467


210. 


4 


GEALR VGTPGCR RCLPDPOAR I F 3 QKKDLEEDES VTAAJILKS RG 
RSPRKIDOFCNSSNMVUGGVTFRDVAIDFSQEEMECLOPDQRTL 
YRDVMLEN YSHL J SLAGSS I SKPDVI TLLEQBKEPWMWR KETS 
RRYPDLELKYGPEKVSPEITDTSEVNLPKQVIKOlSTTTjGl EAFY 
FRNDSEYROFEGL0GYCEGNINQKM3 SYEKLPTHTPHASLIOJT 
HKP Y E CKECGK Y FSCGS N L 1 QHQ5 1 HTG EK PYK C KE CG KA FQLH 
I0LTRH0KFHTGEKTFECKECGKAFNLPTQLNR1IKNIHTVKXLF 
ECXECG KS FNR S SNL.T0HQS I HAGVKP YQCKECGKAFNRGSNbl 
OHOKIHSMEKPFVCKECGMAFRYHYQL J EHCQ3 HTGEKFFECK3 
CGKAFTLLTKliVRHO K2 K?GEKPFECRECGK*FSLLNQLNR H KN 
IHTGEKPFECKECGKSFNRSSNX>VQH0SIHAG3 kpyeckecgkg 
FNRGAHLIOK0KIHSNEKPFVCRECEMAFRYHC0LJ EHSR IHTG 
DKPFECODC^XAFNRGSSLVOHQSIHTGEKPYECKECGKAFRLY 
LQJiSOHOKTHTGEKPrECKECGKFFRRGSNLNOHRS I HTGKKP? 
ECKECGKAF^LHMHl>IRHOKLHTGEKPFECKECGKAFRLHMOLI 
RHOKLHTGEKPFECKECGKVFSLPTQLNRHKNIHTGEKAS 


5468 


225 


*976 


S FLTDLFQS LAQLEN LGKQLYETTDTTTRLOAEKAiVE FTNS PD 
CLSKCOLLbERGSSS YSOLLAATCLTKliVSRTNNPLPLEORI DI 
RNYVLN YLATR PKLATFVTOALIQLYAR I TKLGKFD CQKDDYVF 
RNAI TDVTRFLODSVEY C I IGVTILSQLTNEINQVSATAELI EA 
TYTTH D T ,T »T-I R K T A <; Q FR D ^ LFD I FTLS CNLLKOASG KNIjNLND 
ESQUGLLMQLLKLTHNCLNFPFI GTS TDESSDDLCTVOI PTS WR 
S AFLDS STLOLS TIGRCEYE KTCAXLVO liFDQSAQS Y QE LLQS A 
SASPMD I AVOEG R LT WLVY 3 3 G AV1GGR VS FASTDEODAMDGEL 
VCRVL0LMNLir?SRI^GAGNEKLELAMI>SFFEOFRK:YlGD0VQ 
KSSKLYRRL.SEVLGLNDETMVLSVFIGK 1 1 TNLKYWGRCEPI TS 
KTLQLLNDLS 1 GV £S VRKXVKLSAVQFMUWHTSEHFS FLGI NN 
QSNLTDMRCRTTFYTALGRLIjMVDIjGEDEDOYEOFMLPLTAAFE 
AVAQM FS 7N S FN E QE/S XR T L VG LVRDLRG I AFAFK A KTS FMtf L F 
EWIYPSYMP J LCFLAI ELWyKDPACTTPVLKXMAELVHNRSQRLO 




1 
i 

| 

1 
1 


FDVSSFNGI LLFRETSKKJTMYGNRILTLGEVPKDOVYAIjKLKG 
2 Si C FSMLXAAl .- S G S Y VN FGV F R LY GDDALDN ALQTP1 K 1/LLS 3 
P » :' S D LLD Y PKLSC-YYSI.LEVLTQDHMNK1AS LEPH V 1 K V I bSS 
JSEG LT'ALDI'HVCTGCCSCLDKl VTYLFKQLSRSTKKRTTPIjNO 
ESDR FLH 1 MO0H FEM 3 OCMl.STVLNI 1 1 FEDCRNQWSMS K P LLG 
LIIiUNEKYFSnbRNSlVWSOFPEKQOAMKLCFENLMEGrtRNLL 
T K.N R OR FTQN LS A r R R I VNDSMKNS TYGVUSNDMMS 


5469


134 


2653 


" DCeTfTS LV P KH Lr MGWL CSG LL F PVSCI . VLLQ VAS5 GNM X VI Q 
FPTCV?:DYMS.ISTCEWKMNGPTNCSTELRLLYObVFL.LS?:AHTC 
VFErWGGAG CVCHI LMDDVVS AENTYTLDLWAGQOLLWKG 5 FKPS 
EHVKPRAPGNLTVKTNVSPTLLLTW5WYPPDNYLYNHLTYAVN 
1 WS ENDPAD FR J YNV'H LEPSLR 1 AASTI ,KSG I S YRAR V R A W AO 
CYK7TKS EWS PSTKWHKS YREPFEQHLLLGVSVSCIVJLAVCLL 
CyVSlTKlK KEKKDQ I FNPARSR LVA 1 1 1 QDAQGSQWE KfcS RGQ 
F.FAKCPKWKNCLTKLLPCFLEHNMKRDEDPHKAAKEKPFOGSGK 
SAWCPVEISKVVLWPE£:£VVRCVEL?EAPVECESEEEVFEFKG 
SFCASPESSRDDFCEGKEGIVARLTESLFLDLLGEENGGFCOOD 
MGESCLL>PPSGSTSA»!P\OEFPSAG?KEAPPWG'AE0PLHLEP5 
PPASPTQSPDNLTCTETPLVIAGNPAYRSFSNSLSQSPCFRELG 
pnPblARHLEEVEPEMPCVPQLSEPTTVPOPEPETWEQJ LRRMV 
] »0H GA« AAP V.SA PTSGYQEF\/HAVEC<5GTQASAVVGLGPPG EAG 
VKAFSSIjLASSAVSPEKCGFGASSGEEGYKPFQDIjIPGCPGPPA 
PVPVrLFTKGLDREPPPSPOSSHLPSSSPKPILGLEPGEKVEDMP 
KFPLPOEOATDPL.VDSLCSGlVYrAbTaiLCGKLKQCHGOCDGO 
U'lPVKASPCCGCCCGDRASPPTTPLRAPDFSPGGVPbEA^LCPA 
SLAPSGT SEKSKSSSSEK PAPGNAQ S SSQT F K 2 VN FVSVG FTYM 
RVS 


5470


17 


34 28 


TArRjKl\SLNRGlA^VK):OAVE«!ASYGLAy-SLMKFFTGPMSDP 
K KVG 1 .V FVNS KRDRTKAVLCMWAG A3 AAVFKTL I AY SDLG Y Y 1 
INKbH^VDESVGSKTRRAFLYlAAFPFMDAf'lAWTHAGILb^KY 
SFLVGCASIS DVI AOVVFVAILLKSHLECftEPLLlPILSLYMGA 
LVPCTTL.CLGYYK.N1 HD] I PDRSGFELGGD/iTI RKMLSFWWPLA 
LI 1 JkTQR 1 SK P 1 VNLFVS RDLGGSS AATEA VA 1 LTAT YPVG HM P 
YGKLTEI RAVV PAFDKNNPSNKL»VSTSKTVTAAH I KKFTFVCMA 
L-SLTLCFVMFWTPNVSEXJLID1 IGVDFAFAELCWPLR3 KSFF 
PVPVTVRJ\HLT(iWLMTLKKTr VliAPSSVLRl IVLIASLVVLPYL 
GVRG A7LGVGS LLAG FVC-ESTKDAI AACWYR KQKKKMENE SAT 
EGEDSAMTDMPPTEEVTDJ VEMREENE 


5471 



1868 


656 


r £ s a p pg pqraaaa taaaaaag vemaaaaaggggggeprrteg v 
gpgvpgevemvkgqpfdvgprytqloyigegayg^ssaydhvr 
ktrva:kkispfehotycortlreioillrfrhenvigifdilr 

ASTLE AMRDVY 1 VQDl^ETDL YKLLXSQOLSNDH J CYFLYO 3 LR 
GLKY 1 HS AWLHRDbK PSNLLI NTTCDLXI CDFGI ARIAEP EHD 
HTGFLTE YVAT R W Y RAPE 1 MI ,NS KGYTKS 1 D I WS VGC I LAEKLS 
TOPI FPG KHY LDQLNH I LG 1 LGSPSCEDLNCI INMKAJWLCS L 
PSKTK\'AV:AK1 FPKS DSXALDLLDRKLTFKPNKR 1 TVEEALAHP 
YLEQ Y YD PTDE P V A E E P FT F AM E LDD LP KE R LKEL I FQETAJ\ FO 
PGVLEAP 


5472 


1469 


753 


LYVMARYLSDEEVA.VS I DR LCKANGR SPS I PFGTW1PGRARVR 
DP0ALW3 FGYGSLWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 
G S DKMPGR WTLLE DH EGCTWGVAYQVQGEQVS KALKYLNVTiEA 
VLGG YDT K EVTF Y PQDAPDOPLKALAYVATPONPG YLG PAFEEA 
1 ATOI IACRG FSGHNLEYLLRVRDVM0LCGPQA0DEHUAA1 VDA 
VGTMLPCFCPTEOALALV 


5473 


3 


2119 


FMNVKLL I QDLEDl EQRVPVMDAOYKI 1 TKTAHLI TKESPOEEG 
KEMFATMSX1.KEOL7KVKECYSPLLYESOOI*LIPLBELEKOKTS 
FYPSLGKINEUTVLEREAOSSALFKQKHQELLACOENCKKTLT 
L 3 EKGSQS VOKFVTLSNVLXHFDQTRLQROI ADIHVAFQS '4VKK 
TGDWKKJrvErNCRLMKKFEESRAELEKVLRIAQEGLEEKGDPEE 
LLRRHTEFFSOLCORVLNAFLKACDELTDILPEOEQQGLQEAVR 
K LHKOWKDLQGGAP YHLLHLKI DVEKNRFLA.S AEECRTELPR ET 



329 











KLMPOECEEKI I KEHRVFFSDKGPHHLCEKRLOLI EEbCVKLPV 
RDPVRD'J I 'G7 GH VTL K E L 3J\A 1 DS T YR K 1 NEDPDKWKDYTSRFS 
EFSSWI TTKETOLKGI K G E A I DTANHG EA'" K RA V E E 1 K NG V T KRG 
E TLS WL K f KLKVIjTEVS S LW^AOKQGDEI J-.KLS S SFKALVTLXS 
EVEKMLSHFGDCVOYKE-VKNSLEELISGSKEV0EOARKILDTE 
NLFEACC'L LLHHQOKTKJR I SAKKRDVOOC 3 AQAQ0GEGGL7DRG 
HEEI.RKLESTLDGLERSRE^OERRIQVTl RKWERFETNKETWR 
YLFOTGrrKERFLSFSSU'SLSSELEOTKEFSKRTESIAVOAEN 
LVKEVXSFIPLGPONKQLLOOQAKSIKEQVKKLEDTLEEEYVIDK 
S 


5474 




780 


TPDVRQhOASRRGIAVASWCSPRWFAGEtKiAFVKSoWLLRQSTl 
LKRWKKNnN*FDI,WSDGHL I Y YDDCTRQN I EDKVHMPMDC1 N I RTG 
OECRPTQ PPDG KS KDCMI.y 1 VCRDGKTI PLCAESTDDCLAWKFT 
LODSRTNTAYVGSAVMTDETSWSSPPPVTAVAAPAPEVGRTLS 
LOOAYGYGPyGGAYPPGTO^/VyAANGOAYAVrYOYPYAGLYGCXJ 
PANQVI T R ER Y R DNDS DLA LGiXLAG AATG KALG S h F WV I 


5475




SO* 


ARGWL-ES LSLTC0TTPPPS5 PCLLHSPETFI HTMPPNLTG Y YRF 
VS0KNMEDYLOALNISLAVRKIALLLKPDKEIEH0GNHMTVRTL 

stfrnytvofdvgvefeedlrsvdgrkcctlvtweeehbvcvqk 
gevpkrgv:rhwlegemlylel,tardavcecvfrkvr 


5476 


is: "i 


1457 


SDSMSI.LDCFCTSRTQVESLKPEK0SETS1H0YLVDEPTLSWSR 
PSTRASEVl.CSTNVSHYELOVElGRGFDNLTSVlfliARHTPTGTL 

vtikitnlencmeerlkalokavilshffrjhpnittywtvftvg 

S WLW VI S F FKA YG S ASQt LK T Y FPEGMS E 7 L 3 RIM IIjFGAVR G LN 
YLHONGC Z HRS 2 XASH3 L J £ GDGLVTLSGJ^HLHSLVKHGCRHR 
AVYDFPQFST5V0PWLSPELLRQDLHGYNVKSD1 YSVG37ACEL 
ASG0VPFCDWHRTOMLLQKLKGPPYSPLD7 S3 FPOSESKMKNSQ 
SnVDSGl ^ESVLVSSGTHTVNSDRLHTPSSKT FSPAFFSLVQLC 
LOODPEKRPSASSLLSHVFFKOMKEESODSJI.SLLPPAYNKPSI 
S LPF VLP WTEF ECD FPDEKDE YVJEF 


5477 


> 


1044 


RGNSRLRYEHEDELOLPRIiPELFBTGROM.DEVEVATEPAGSRI 
VQEKVFXGr-DJLLEKAAEKLSOliDLFSRKEDI/EEIASTDbKYLLV 
PAFOGAbTMKOVW PSK RLDH l.QRAR EK F I N Y LTQCH C Y HV A EFH 
LP KTMNN £ AENKTANS SMAY PS LVAMA S OR OAK 1QRYKOK KELE 
HRLSAMX A VESGOADDERVREYYLLHLOP W 1 D3 SLEE1 E£ 3 DO 
ElKILRERnPSREASTSNSSROERPPVKPFlLTRNMAOAKVFGA 
GYpSLPTMrVSDWYE0HRKYGAL.PDOGIAKAAPEEFRKAA0QQE 
EOEEKELEDDEOTLHRAREWDDWKDTKPRGYGNR0NMG 


5478 


2 


835 


KTVR I WV P tfVKGES TVFRAK7ATVRSVH F C S D3QS FVTASDDKT 
VXVWATHR0KFLFSLSOHINVJVRCAKFSPDGRLIVSASDDKTVK 
LWDKS SR E CVHSY CEHGGFVTYVDFH PSGTC1 AAAGMDNTV KVW 
DVRTHRL1,0HYCLHSAAVNGLSFHPSGNYLITASSDSTLKILDL 

M EGRLLY t lhghqg pattvaf s rtgey fae gg sdeqvmvwx s w f 

DlGDHGEVTKVFRPPATLASSMGNLTVSIiEORLTLEEDKLKOC 
LBH0OWWRATF 


5479


2 


835 


KTVR I WVPNVKGESTVFRAHTATVRSVKFCSDGOS FVTASDDKT 
VKVWATHROKFLF S LSQHINWVR CAKFS PDGRLJ VSASDDKTVK 
LWDKS SR E CVHS Y CEHGGFVT YVDFH PSGTC I AAAGMDNTVKVW 
DVRTHPLLOKYOLKSAAVNGLSFHPSGNYLI TASSDSTLK1 LDI> 
MEGRLbYTLKGHOGPATTVAPSRTGEYFASGGSDEQVMVWKSNF 
Dl GDHGEVTKVPRPPATLASSMGNLTVS 1 LEQRI»TLEEDXLKQC 
LEN0QL3MCEATF 


5480 


444 


1952 


LSLTS FWEEAELVKGRLOA1TDKRK1QEE1 SQK-RXK1 EEDKJLKH 
0HLKKKALFEXWLLDGISSGKE0EEMKKQN00DQH03QVLEOSI 
LRLEKEJ QELEKAELOI STKEEAILKKLKS 1 ERTTEDI2RSVKV 
EREERAEES I EDIYAUI PDLPKSYI PSRLRKEI NEEKEDDSQNR 
KAXYAME1 KVEKDLKTGESTVLSSI PLPSDDFKGTG1 KVYDDGQ 
KSVYAVSSNHSAAYNGTDGLAPVEVEELLRCASERNSKSPTEYH 
EPVYANPPYRPTTPORETVTPGPNF0ER 1K1 KTNG1G2 GVNESI 
HNMGNGLSEERGNNFKH I S PI PPVPH PRSV 1 OOAEEKLHTPQKR 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LMTPWErSNVMODKPAPSPKPRLSPRETIFGKSEHONSSFTCQE 
DEEDVRyWIVHSLPPDINDTEPVTMIFMGYQQAEDSEEDKKFLT 
GYDGllHAELVVIDDEEEEDEGEAEKFSYHPlAPKSQVYQPAXP 
TPLPFKRSEASPHEKHKS 


S46i 




1422 


NSPGSVCLCOCVCPSLLHCLPPLLLLLLLPLLLHESPQPPALRV 
VATSSDRl^KMNKHQKPVLTGQRFKTRKRDEKEKFEPTVFRDTLV 
OGLNEAGDDLE AVAKFLDSTGSR LD YRR YADTLFDI LVAGS M LA 
PGGTR 1 DDGDKTKMTNHCVFS AN E DHET I RN Y AO V FN Kh I R R Y K 
YLEKAFEDEMKKLLLFLKAFSFn'EQTKL/vMLSGlLLGNGTI.PAT 
3 LTS LFT DS IjVKEG I AAS FAV KL FKAWKAE KDANS VTS S LR KAN 
LDK^LLEL.FPVNROSVDHFAKyFTDAGlKELSDFLRVOOSLGTR 
KELOKEbCERUSOECPIKEWLYVKEEKKRNDbPETAVJt^LWT 
C I MNAVE VJNKKEE LVAEQ.A1 K H L KQYAF LLAV FS S OGOS ELI LL 
OKVOEYCYDNIHFMKAFOK1V\'LFTKADVLSEEA3LKWYKEAHV 
AKGKSVFLDQMKKFVEWLQNAEEESESEGEEK 


548;" 


3492 


£28 


TKWMTGMCYAPHQVO.SYINGVTTSKPGVSLVYSKPSRNLSLRL 
EGLOEKDSGPYSCSVNVODKOGKSRGK^IKTLELhTVLVPPAPPS 
CRLOGVPHVGANVTLSCQSPRSKPAV0YOWDRQLFSFOTFFAPA 
LDV I RGS LS LTNLS SSMAGV YVC KAHNEVGTAUCNrVTLE VSTG P 
GAAWAGAWGTLVGLGLLAGLVLLYHPRGKALEEPANDIKEDA 
1APRTLPWPKSSDT1SKNGTLSSVTSARALRPPHGPPRPGALTP 
TPSLSSQALP5 PRLPTTDGAUPOP I £ PI PGGVS'SS'GLSRMGAVP 
VMVPA0SOAGSLV 


S483 


3 


786 


FFFFXGCRAGRGNESDYRKLEEKHQRFI A/S ERSKDDLOLRLTRA 
ENRIK0LETDSSEElSRY0EKIQKLONVLE£ERENCGLVi>EORL 
KLQOENKOLRKETESLRKIALEAOKKAKVKISTHEHEFSJKF.RG 
FEV01>REMEDSNRNS3VELRHLLATQ0KAANRWKEETKKLTESA 
F1KTNNLKSELSRQKLHTQELLSOLEMAJ>JEKVAENEKLILEHOE 
KANR LORRLS OAEERAASASOOLS V I T VCR R KAA S LMNLEN I 


5484 




1997 


IMADMEDLFGSDADSEAERXDSDSGSD5DSDQENAASGSNASGS 
ESDODERGDSGOPSNKELFGDDSEDEGASHHSGSDNHSERSDNR 
SEASERSDHEDNDPSDVDQHSGSEAPNDDF.DtGHRSDGGSHHSE 
AEGS E KAKSDDEKWGREDJCSDOS DDEK I ONSDDE ERAOGSDEDK 
LONSDDDEKMONTDDEERPOLSDnEROOLSEEEKANSDDERPVA 
SDNDlJEKQNSDDEEOPQLSDEEKMQNSDDERFOASDEEHRHSDD 
EEEODHKS ES ARGSPSEDEVLRKKR KNA I ASDS EADSDTEVPKD 
NSGTMDLFGGADDXSSGSDGEDKPPTPGQPVDENGLPOD00EEE 
PIPETR1EVE1PKVNTDLGNDLYFVKLPNFLSVEPRPFDPCYYE 
DEFEDEEMLDEEGRTRLKLKVENT3RWR1 RRDEEGNEI KESNAR 
I VKWS DGSI^S bHLGNEVFDVY KAPLOGD r WHLF 1 ROGTGLOGOA 
VFKTKLTFRPHSTDSATHRKMTI^LADJRCSKTOKIRILPMAGRD 
PECORTEMIKKEEERLRASIRRESOORRMRFKOHORGLSASYLE 
PDRYDEEEEGEESISLAAIKKRYKGG1REERAR1YSSDSDEGSE 
EDKAQRLLKAKKLTSDEVRPNLFNSRGLSCTOEPTALNEELTDQ 
AGTtf 


548B 


161 


107 4 


KRKILSSKxMDSEAHEKRPPILTSSKQDJSPHlTNVGEMKHYLCG 
CCAAFJWVA1TFPIOKVLFR0OLYG1KTRDA1LOLRRDGFRNLY 
RG I LP PLyiQKTTTLALMFGLYEDLSCLLK KHVS APE FATSGV AA 
VLAGTTEAlFTPLERVQTLLODHjamDKFTNTYOAFKALKCHGI 
GEYYRGLVP1LFRNGLSNVLFFGLRGPJKEHLPTATTHSAHLVN 
DFICGGLLGAMLGFLFFPINVVKTRIQSOIGGEFOSFPKVF0KI 
WLERDRKLINLFRGAHLNYHRSLI SWGI I NATYEFLLKV1 


54B6 


1404 


142 


1 PGSTa SWSFAAARGLSVCRCCRLHPASAJ^DLFGDLPEPERS PR 
PAAGKEAOKGPLLFDDLPPASSTDSGSGGPLLFDDLPPASSGDS 
GSLATSISOMVKTEGKGAKRKTSEEEKNGSEELVEKKVCKASSV 
1 FGLKG Y VAER KGEREEMQDAHV I LNDI TEECR PPS S LI TRVSY 
FAVFDGKGGIRASKFAAO^LHONLJRKFPKGDVISVEKTVKRCL 
LDTFKHTDEE FZjKOASSQKPAW KDGS TATCVLA VDN I L Y J ANLG 
DSRAI LCR YKEESOKHAALSLS KEKNFTO YEERMRI OKAGGNVR 
DGRVLGVLEVSRSIGtJGQYKRCGVTSVFDI RRCOLTPNDRF1 LL 








ACDGLFKVFTPEEAVNF1 LSCLEDKKl QTR EG KSAADAR YE AAC 
NRLANKAVQRGSADNVTVMWRIGK 


5487 


535 


3 62


AVSL2QIRGLQTi J APVPLPbQPCPSNCDMERVTLALLLLAGLTA
LEANDPF7\NKDDPFVyDWKNL0LSGLICGGLLAIAGIAAVLSGK
CKCKSSQKQHSPVPEKAIPLITPGSATTC


5466 


1072 


255 


AI4AASGB PQR OW0E EVAAVWVGS CMTDLVS bTSRLPKTGETl H 

ghkffigfggkganqcvqa^rlcajVtswckvgkdsfgndyien 
lk0ndisteftyotkdaatgtasi i vnnfgoni ivivaganlll 
ntedlraaanvi srakvmvcqle3 tpatslealtmarrsgvktl 
ftipapa1adldp0fytlsdvfccneseae1ltgltvgsaadage 
aalvllkrgcowiltlgasgcwbsqtepepkhiptekvkavd 
•rrvsFKi 


5489 


8 i 




GKGPVAAFIDOSjNIFLTDPXIFLGOWREEPKMPLLLLGFTEPLK 
LERDCRSPVEPWAAASPDLALACLCHCQDLSSGAFPNRGVLGGV 

lfptvemvikvfvatssgsiairkkqqewgfleankidfkeld 

I AGDEDNRRWMRENVPGEKKPQNG1 PLPPQJ FNEEQYCGDFDSF 
FSAKEENI3YSFLGIAPPPDSKGSEKA3EGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEET2E3AMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5490 


81 


89? 


GKGPVAAFIDQSNI FLTDPKI FLGQWREEPKMPLLLLGETEPLK 
LERDCRSPVEPV7AAASPD1A1ACLCHCQDLSSGAFPNRGV1/GGV 
LFPTVEMVIKVFVATS55GS JA3RKK00EWGFLEAKKIDFKELD 
I AGDEDNRRWMRENVPGEKKPQNG 1 PLPPQI FNEEQYCGDFDSF 
FSAKEENIIYSFLGLAPPPDSKGSEKAE5GGETEAQKEGSEDVG 
NLPEAQEKNEE EG ETATEE7EE 1 AMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5491 


204 


2194 


GSAPRLSLGPTGAQARDPDWWARPPSKPYTOSKEDRPDTEGRSE 
O^SDMASSFLPAGAI'IGDSGGELSSGDDSGEVEFPHSPEIEETSC 
LAELFEXAAAKLOGLI QVASREQLLYLY AH Y KQVKVGNCKT PX? 
SFFDFEGKQKWEAWKAU5DSSPS0AM0EYIAWKKLDPGWNP0I 
PEKKGKEANTGFGGPVISSLYHEETIREEDKN1FDYCRENN1DH 
I TKA3 KSKNVBVNVKDEEGRALLHv;ACDRGHKELVTVLliO>!RAD 
I NCQDNEGQTA1.H Y AS ACE FbD I VELLLOSG ADPTLR DQDGC LP 
EEVTGCKTVSLVLORHTTGKA 


5492 


3 


1 £96 


AS KNP LS A VCTTG 1 MSS LAVR DP AMDR SLR SV F VGN I ? YEAT EE 
QLKDI FSEVGS VVS FRLVYDRETG K PKG YGFCEYQDQETALS AK 
RKLNGREFSGRALRVDNAASEKNKEELKSLGPAAP1IDSPYGDP 
IDPEDAPESITRAVASLPPEOMFELNJKQMXLCVQNSHQEARNML 
LONPOLAYALbQAO WM RIMDPEI ALK I LHR K I HVTP L1PGKS0 
SVSVSGPGPGPGPGLC?GPNVl,LNOONPPAPOPQHIiARRPVKDl 
PPLMQTPIQGGI PAPGP 1PAAVPGAGPGSLTPGGAMQPQLGMPG 
VGPVPLERGOVOKSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 
GGTlXSVTGHVEFRGybGPFHOGPPMHHASGHDTRGPSSHEMRG 
GPLGDPRLLIGEFRGPM1DQRGLPMDGRGGRDSRAMETRAMETE 
VLETRVMERRGMETCAMETRGMEARGMDARGLEMRGPVPSSRGP 
MTGG1CGPGPIN1GAGGPPOGPR0VPGISGVGNPGAGMCX5TGJ0 
GTGMQGAG I QGGGMQG AG 1 QG VS I OGGG I QGGGI QGAS KQGGSO 
PSSFSPGQSQVTPODCEKAALIMOVLOLTADOrAMLPPEOROSl 
LILKEQIQKSTGAE 


54 93 


1 


187t 


RAPMMTKAVPEEPRKPGRLTOALN S P LTWEHVW I CVPGGTPDCL 
TDTFRVKRPHLRRSASNGHVPGTPVYREKEDMyDEIIELKKSLH 
VOKSDVDLNRTKLRRLEEENSRKDROIEOLLDPSRGTDFVRTLA 
EKRPDASWVINGLKQRI LKLE0OCKEKDGT1 S KLQTDMKTTNLE 
EMR I AMETYYEE VHR LQTLLAS S ETTG KKPLGEKKTGAKRQ K K/< 
GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQGYVEW 
SKPRLLRRIVELEKKLSVWESSKSHAAEPVRSHPPACLASSSAI. 
HRQPRGDRmDHERLRGAVRDLKEERTALQEQLLQRDLEVKCLL 
OAKADLEKBLECAREGE EERRERE EVLRE B I QTLTSKLQELQEK 
KXEEKSDCPEVPHKAOELPAPTPSSRHCEODWPPDSSEEGLFRF 
RS PCS DGRRDAAAR VLOAQWKVY KHK KKKAVLDEAAVVLOAA FR 








G HLTRTKLMS KAHGS E PPS VPG 1 » PDCS S P V PR V PS P 1 AOATGS 
PVQEEAI vnQSALRAHLARARHSATGXRTTTAA^TRRRSASAT 
HGDASSPPFLAALPDPSPSGPOAVAPLPGDDVNSrDSDDIVlAP 
SLPTKNFPV 


54 94 


71 


5.1'. 


RSKAKIGTPTREVPSTDKKVRRESSSSLTHRPAPSPATPRIjLGT 
RRVLIjGVS EGTGCADAMEI.VLV F LCSLLAPMVLASAAEXEKEKD 
PFHYDYQTLRlGGLVFAVVLFSVGILLILSRRCKCSFNQKPRAP 
GDEEAQVENLITANATEPOKAEN 


5195 


273 


2 3 € * 


DSLLLI0VDTMPFTLHLRSRLPSAIRSLIL0KKPN1RNTSSMAG 
ELRPASLWLPRSLAPAFERFWWGPLPLLGOSEPEKWMLPF 
0GAISETRMGHP0FWKY£FGACTGSLASLE0YSEC?LKDMVAFF1i 
GCSFSLEEALEKAGLPRRDPAGHSOAGAYKTTVPCVTHAGFCCP 
bVVTMRPl PKDKLEGLVRACCSLGGEQGQPVHMGDPELLGI KEL 
S K PAYGDAMVCPPGEVPV FWPS PLTSLGAVSS CETPLA FAS 1 PG 
CTVWTDLKDAKAPPGCLTPER1PEVHH1SQDPLHYSIASVSAS0 
K I R E1.ESM1 G 1 DPGNRG 1 GHLLCKDELLKASLSL3HARS VLI TT 
GFPTHFNHEPPEEI'DGPPGAVALVAFLQALEKEVAIJVDORAW 
LKOXIVEDAVEOGVLKTOlPILTYOGGSVEAAQAFLCKNGDPQT 
PRFDHLVAI ERAGRAADGNY YNARKMNIKliLVDP I DDLFLAAXK 
I FG ISSTGVGDGGNELGMGKVKEAVRRHI RHGDV I ACDVEADFA 
V I AG V SNWGG Y ALACALY 1 LYSCAVHSQY LR XA VG PS RAPGDQA 
HTQAL.PSV1 KEEKMLGI liVQHKVRSGVSGIVGMEVDGLPFHNTH 
AEMIQKLVDVTTAQV 


£>4 96 


3 




QDTKMHEI YKGNITPQLNKNTL.KTSAATDVWAVY FSQFW I DYEG 
rtKSGKGRPlSPVDSFPLSlWICQPTRYAESQKEPOTCNOVSLNT 
SOSBSSDLAGRLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 
FSPSSSEADIHLLVrlVHKHVSMOlNHYOYLbLLFLHESLll.LSE 
NLRKDVEAVTGSPASOTS1CIGILLRSAELALLLHPVDQANTLK 
SPVSESVSPWPDYLPTENGDFI^SSKRXQISRDIWRIRSVTVNH 
MSDKRSMS VDLSHI PLKDPLLFXSASDTNLQKGI S FMDYLSDKH 
bGK!SEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVl.NYRS 
DSNILSFDSDGNQNILSSTLTSKGNETI ESI FKAEDLLPEAASL 
SENLDlSKEETPPVRTLKSQSSL/SGKPKERCPPNIiAPLCVSYXN 
MKRSSS0MSLDTISLDSM1LEE0LLESDGSDSHMFLEKGNKKNS 
TTNYRGTAESVh'AGANLQNYGETSPDAISTNSEGAOENHDDLMS 
WVFK3TGWGEID1 RGED7E1 CLCVNQVTPDQLGNI SLRHYLC 
m PVGSDQXAV I HS KSS PE 1 SLR FESG PGAV I HS LLAEKNGFLQ 
CHIKNFSTEFLTSSLMNIOHFLEDETVATVMPWXigVSNTKINJL 
KDDSPRSSTVSLEPAPVTVHIDHLWERSDDGSFHIRDSHMLNT 
GNDLXENVKSDSVLLTSGXYDLKKORSVTQATQTSPGVPWPSQS 
ANFPBFSFDFTREOLHEFJ^SLXOEliAKAKMALAEAHLEKDALIj 
HHIXKMTVE 


5497 


1821 


i 330t 


S 1 SKLliKRRSN 1 DAYLLSNS CAF F APR LFSbASQ 1 1 REQQS PNV 
CFI YKYSG FPS LEOQCHFVS PHSSCYIN FFS FP P P FFVCFQLSN 
GFSHY SLS S E SHVG PTGAGLF PHCLP ASR LLPR VTS VH L PD YAH 
YYTIGPGMFPSSQIPSWXJDKAKPGPYDQPLVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEEAORPRSMTVSAATRPGEEMEACEELA 
LALSRGLQLDTQRSSRDSLCCSSGYSTQTTTPCCSEDTIPSOVS 
DYPYFSVSGDOEADQOEFDXSSTI PRNS DI SOS YRRWFQAKRPA 
ETAGLPTTLGPAKVTPGVATI RRTPSTX PS VRRGT3 G AG PI PI K 
TPVI PVKTPTV PDLPGVLPAPPDGPEERGEHS PES PSVGEGPOG 

EP PSATVS PGQ I PESDPADLS PRDTPQGEDMI>NAI RRG VKLKKT 
TTNDRSAPRFS 


S498 


2434 


14SS 


1 LTHQEI FTGEXPCECGKAS 1 OMSHLSQQKI YSGENPFACKVCG 
KVFSHKSNLTEHEHFHTREKPFECNECGKAFSOKOYVI KHQNTH 
TGEXLFECNECG KS FSOXENLLTHOKIKTGEKPFECXDCGXAFI 
QKSNLIRHORTHTGEXPFVCXECGXTFSGKSNLTEHEKIHIGEK 
PFXCSECGTAFGQXKYLIKHONIKTGEKPYECNECGKAFSQRTS 
L3 VHVR1HSGDKPYECNVCGKAFS0S SSLTVHVRSHTGEKPYGC 
KECGKAFSQFSTLALKLRIHTGKXPYQCS ECGKAFSQKSHH I RH 








OKI HTH 


5499 


324 




GFGQI GRGHK1 TTYPFSPRKSGRKGMA0SQGWVKRV1 KAFCKGF 
FVAVPVAVTFLDRVACVARVEGASMQ?SLNPGGSQSSD\nrLL.MH 
WKVRNPEVHRGD1VSLVSPKNPE0KIIKRVIALEGDIVRTIGHK 
NR YVKVPRGH J WVEGDHHGHSFDSNSFGPVSLGLLKAHATH I LW 
PPERWQKLESVLPPERLPVQREEE 


S500 


197B 


1286 


KPDWRLQNbFPRLYLWRSSRFGFGHLKKRLOMDPKlEKTWDGPP 
VKHEPVFIRLNPGDRGVKMDISAPFFRDPPAPLGEPGKPFNELW 
DYE WEAFFLND I TEQYLEVELCPHG0HLVLLLSGRR1WWKQEL 
PLSPRVSRGETKWEGKAYLPWSYFPPNVTKFNSFA1HGSKDKRS 
YEALYPVPOHELOOGOKPDFHCLEYFKSFNFNTLLGEEWKQPSS 
DLWLIEKCD3 


5501 



2927 



222fc 


CRPPVSARVAPGHOGAVGGSGRRPARVEWDAAARPSSRPFSLP 
AAI WLAL I SRLLD W FR S L FWKEEWELTLVGLQ YS6 KTTFVNVI A 
SGOFSEDMIPTVGFNMRKVTXGNVTIKIWDIGG0PRFRSMWERY 
CRGVNAI VYMI DAADREK1 EASRKELHNLLDKPQLCG 1 PVLVLG 
NKRDLPNALDEKQLI EKMNLSAI QDRE I CCYS I S CKEKDWI D I T 
LQWLI QHSKSRRS 


5502 




824 


NS AF P VW VPERTALbTCP LGAA PGS S REAPG 1 AG P PNSTAMS KL 
GKFFKGGGSSKSRAA PSPOEALVRLRETEEKLGKKOEYLENR 1 0 
REIALAKKHGTONKRAAL0ALKRKKRFEKQLTQIDGTLSTIEF0 
REALENSHTNTE VLRNMG FAAKAMKS VHENMDLNKI DDLMQE I T 
EQQDIAQEISFAFSQRVGFGDDFDEDELMAELEELEQEELNKKM 
TNIRLPNVPSSSLPAOPNRKPGMSSTARRSRAASSORAEEEDDD 
IKQLAAWAT 


5503 


216 


6 54 


KGVRRRGRVRSDSEDSHLiGYFKMSFLLPKLTSKKEVDQAIKSTA 
EKVLVLRPGRDEDPVCLCLDDI LSKTSS0LSKMAAI YLVDVDQT 
AVYTCYFDISYI PSTVFFFNGOHMKVDYGGEDPALRSI KAVRRT 
SPAGTLGEKPVNS 


S504 


58 


3SG3 


OLSFSFQAPVTFDDJTVYLLOEEWVLLSQOQKELCGSNKJbVAPli 
GPTVAN PELFR K FGR GP EPWLGS VCGQRSLLEHHPGKKQMGYWG 
EMSVOGPTRESGQSLPPOKKAYI^SHLSTGSGHIEGDWAGRNRKL 
LKPRS I QKS WFVQ fP WL 1 MNEEQTALFCSACREY PS 1 RDKRSRL 
I EG YTG P FKVETLK YJ1A KS KAR MFCVNALAARDPI WAARFRSI R 
DPPGDVLASPEPLFTADCPl FY PPGPLGGFDSMAELLPSSRAEL 
EDPGGDCA I PAM YLDC1 S DLRQKEI TDGIHSSSD INI L YNDAVE 
SClODPSAEGbSEEVPWFEELPWFEDVAVYFTREEWGMLDKR 
QKELYR DVMRMNY ELLAS LG PAAAKPDLI S KLERRAAPWI KDPN 
GPXWGKGRPPGNKKMVAVREAPTOASAADSALLPGSPVEARASC 
CSSS I CEEGDGPRR I KRTYRPRSI QRSWFGQFPWLVI DPKETKL 
FCSACIERPNLHDXSSRLVRGYTGPFKVETLKYHBVSKAHRLCV 
NTVE I KEDTPHTALVPE I SSDLMANMEHFFNAAYS I AYHSRPLN 
DFEKILQLLQSTGTVI LGKYRtfRTACTQFI KYISETLKRE1LBD 
VRNS PCVSVLLDSSTDAS ECACVG I YIR YFKQMEVKES YI TLAP 
LYSETADGYFETIVSALDELD1 P FRKPGWWGLGTDGS AMLSCR 
GGLVEKFQEVI PQLLPVHCVAHR LHLAVYDACGS IDLV KKCDRH 
IRTVFKFYOSSNKRLKELQEGAAPLEQE1IRLKDLNAVRWVASR 
RRTLHALLVSWPALARHLORVAEAGGQIGHRAKGKLKLMRGFHF 
VXFCHFLLDFLSI YRPLSEVCQKEI VL1 TEVNATLGRAYVALES 
LRHOAGPKEEEFNASFKDGRLHGICLDKLEVAEORPOADRERTV 
LTGIEYLOQRFDADRP POLKNMEYFDTMAWPSGIELASFGNDD1 
LN1ARYFECSLP1X3YSEEALLEEWLGLKTIA0HLPPSMLCKNAL 
AOHCRFPLLSKLMAVWCVPl STSCCERGFKAMNRI RTDERTKL 
SNEVLNMLMMTAVNGVAVTEYDPQPA1QHWY LTSSGRR FS H VYT 
CAQVPARS PASARLRKEEMGALYVEEPRTQKP P I LP SRE AAEVL 
KDCIMEPPERLLYPHTSOEAPGMS 


5506 


3312 


1219 


NCSPRSL5AAWA3lTOlWNKJbPSNLPQLCKLlKRDPPAY 1 EEFLQ 
QYNH YXSNVE I FKLOPNKPSKELAELVMFMAQISHCYPE YLSNF 
PQEV KDLLS CNHTVLD PDLRMT FCKALILLRNKNLINP S SLLEL 
FPELFRCHDKLL^KTLYTHIVTDIKMIIWKIIKI^KVNVVIjONFK J 
5506



5507 



5508 



5509 



153: 



3704 



127: 



1151 



123 * 



T9T 



619 






VTMLR DSN ATAAKMS LDVM3 ELY RRNI WNDAKTVNV I TTACFS K 
VTKIUVAALTFFLGKDEDEKCDSDSESEDDGPTARDLLVOYATG 
KKSSKNKKKLEKAKKVLKKHRKKXKPEVFNFSA1KL1HDP0DFA 
EKLLFCQLECCKERFEVKHMLMNLtl SRLVGIHELFXFNFY PFLQR 
FL-0PB0REV7K J LLFAAOASHHLVPPEI IO/SLLMTVANWFVTDK 
NSGEVKT VGINA1 KE1 T ARCP LJU4TE ELLQDL*AQY XTH JOKNVM 
KSARTLIHLFRTLNPOMLOKKFRGKPTEASIEARVQEYGEl,DAK 
DY1 PGAEVLF-VEKEENAEKDEDGWESrSLSEEEDADGEVn DVQH 
ESDEEQQE 1 SKKLNSMPMEERKAKAAA1STSRVLTQEDFQKI KM 
AOMRKELDAAPGKSOKRKYIEIDSDEEPRGELLSLRDIERLiHKK 
PKSDKETRLATAMAGKTDRKEFVRKKTKTWPFSSSTNKEKKKOK 
NFMWRySONVRSK^KRSFREKQLALRDALLKKKKRMK 



fkgdbcgokggsapgeggssawpapa^flperererealcpgrs 
cscgggeetpgttpwsplegggdeelrpnpyvrfpyrwkavvv 
laafpsigaggetpeappeswtolwffrpwnaagyasfr^pgy 
ll vq y frr kny l.etg r glc fpl»v kac v fgn e p kasde v p laprt 
eaaettpmw0alkllfcatglovsyltwgvl0ervmtrsygata 

TSPGERFTDSOFLVLMNR^/IjALI VAG1>SCV1»CKQPRHGAPKYRY 

sf aslsnv lus s w co yealkfv s fptqvlakas kv 1 pvmlmg klv 
srrsyehweyi>tatlis:gvsmfllssgpeprsspattlsglil 
lagyiafdsftsnwodalfaykmssvcmnfgvnffsclftvgsl 
leqgallegtrfmgrhsefaahallls i csacgql.fi fytigqf 

GAAVFTI I MTLRQAFA1 LLS CLL YG H T VTWGGLG VA W FAALL 
LR VYARGR LKQRGK KAV P VESP VQKV 



PRGTRRCR FAGRAS KRARRR PPCPGPAAPGSLE I GGFGTAAGKK 
VAVAD VQFG PMR FH0DOL0VLLVFTKEDNQCNGFCRACE K AG FK 
CTVTKEAOAVIiACFLDKHHDI 1 1 JDHRNPRQLDAEALCRSI RSS 
KI.SENTVI VGWRRVDREELSVMPFHSAGFTRRYVENPN1 MACY 
NELLQLEFGEVR SQUOLRAOJS VFTALENSEDA I EI TSEDR F I Q 
YAN PAFEriMG Y OS GEL1 GKELGEVP I NEKKABLLDTl NS C I RI 
GKEWQG I YYAK K KNGDN I QQNVK jjpvi GQGGK I RHYVS 1 1 RVC 
NGNNKAEK 1 SB CVOSDTHTDNOTGKHKDRRKGSLDVKAVASRAT 
EVSSQR RH SSMAR I HSMT 1EAPI TKV I N II NAAQESS PMPVTEA 
LDRVLE I LRTTELYSPOFGAKDDDPHANDLVGGLMSDGLRRLSG 
NEYVLSTKNTQMVSSNI TTP1 SLDDVPPRIARAMENEEYWDFDI 
FELEAATHNR PL>I Y LGLKM FARFG I CEFLHCSE S TLRSKLQ HE 
ANVHS SNPYHKS TH S ADVLHATA YFLSKER I KETLDPI DE VAAL. 
I AATI HDVDHPGRTNSFLCNAGSELAI LYNDTAVLESHHA/.LAF 
OLTTGDPKCNI FKNMERND YRTLRQGI I DMVLATEMTKHFEHVN 
KFVKS 1NKPLATLEENGETDKNQEVINTMLRTPENRTLI KRML-I 
KCADVSNPCRPL.OY CI EWAAR I SEEY FSQTDEE K0QGL.P WM PV 
FDRNTCS I PKSO ISFI DYFI TDMFDAWDAFVDLPDLMQHLDNNF 
XYWKGL.DEMXLRNLR PPPE 



"LS S VFS RRS ASM FAVGCSMGP FLHYWYLS LDRLFPASGLRG F PN 
VLKKVLVDQLVASPLUTWYFLGLGCLEGOTVGESCOELREK FW 
EFYKADWCVWPAAOFVNFLFVP P0FR VTYINGLTLGWDTY LS YL, 
KYRSPVPLTPPGCVALDTRAD 

RKSRGC0N ALSASG P AAAAAA 1 MVRKLK FHEQKLLKQVDF LN VI E 
VTDHNLHELRVLRRYRL^RREDYTRYNQLSRAVREIARRXRDLP 
ERDQFRVRASAAiLDKLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTOLKLRMAQHLOAAVAFVEQGHVRVGPDWTDPAFLVIRSM 
EDFVTWVDSSKI KRHVL.EYNEERDDFD1XA 



PAGAHL.SSGSSEPLVEPGRGRVGARVKGERGLOASGSAPGRSKM 
AEGERQPPPDSSEEAPPATQNFIIPKKE1HTVPDMGKWKRS0AY 
ADY1GF I LJ-LNEGVKGKKLTFEYRVSEAIEKLVALLNTLDRW ID 
ETPPWOPSRFGNKAYRTWYAKLDEE^ENLVATWPTMLAAAVP 
EVAVYLKES VGNSTR I D YGTGHEAAFAAFLCCLCKI G VLR VDDQ 
IA1VFKVFTOYLEVMRKLOKTYRMEPAGSQGVWGLDDFOFLFFI 
WGSSQL IDH P YLE PRJHF VDEKAVNENHKDYMFLECI LFI TEM KT 
GP FAEK SNQ LiWN 3 S A VP S WS X VNOGL I RMYKAECLE K FP V 1 CH F 
XFGSLL.P1HPVTSG 



5510 



96 



119S 


5511 


276 


158C 


KLSRVLNI.PPENHTSISAVPISOKEEVADFQLSVDSLLEKDND 
KSRPDJOVOAKRbAEKLBCDTWSEISTGORTVKFKINRELLTK 
TVL0OV1 KDGSKYGLKSELFSGbPQKKI VVEFSS PKVAKKKHVG 
HLRSTI I GNFI^LKEALGHQVIRINYLGDWG«0FX3LLG7GFQL 
FGYEEKLOSNPLQKLFEVYVOVWKEAADDXSVAKAAOEFFORLE 
LGDV0AL5LWOKFRDLSIEEYIRVYKRLGVYFDEYSGESFYREK 
S0EVLKLLESKGLLLKTIKGTAWDLSGNGDPSSIC7Vf4RCDGT 
SLYATRDlAAAlDRKDKYWFDTMIYVTDKGOKIQMFTOVFOf-ILKI 
MGYOWAER CO HVP FGWQGMKTRRG0VTFLEDVLNE1 OLRKLQN 
MASIKTTKELKNPQETAERVGLAALIIQDFKGLLLSDYKFSWDR 
VFOSRCDTGVFLOVTHARLKSLEETFC-CGYLNDFNTACLOEPQS 
VSILOHLLRFPEVI^YKSSODFOPRHJVSYLLTI^SHLAAVAHXTL 
01 KDSFPEVAG?\KLKL»FKAVRSVLANGMKLLiGlTPVCRi4 


5522 


120 


101b 


DPSLLlTITVrGVTVLVLVLKSMNSRRREPITLQDPEAKYPLPL 
1EKEKISIWTRRFRFGLPSPDHVLGLPVGNYV0LLAKIDNELW 
RAYTPVSSDDDRGFVDLIIKIYFKNVHPOYPEGGKMTOYLENMK 

GGTGI TPMLQL3 RK 3 TKDPSDRTRMSLI FANQTEEDI LVRKELE 
EIARTHPDQFDLWY7LDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STLI LVCGPPPMOTAAHPNLEKLGYTGDMI FTY 


5513 




637 


ARWRLPSDSPRIPPAGAETPGRGSCRNYLPSSSPPFPEPSSFPS 
PPTSRGGPGSRD'rKSDSEEESODRQLKlWLGDGASGKTSLTTC 
FAOETFGKQYKQTlGLDFFLRRlTLPGr^LNVTLQIWDlGGOTIG 

(~ i/ui ni/vivri/VUT T WnTTVVACPPMI.rTMJYTWVKVcrCCp 
L-KMJLiJJKYJ. i OAyoVLijV I JJJL i. I\ jyorc.riJjc.Uw J l vvaa.v^. rxjC 

TOPLVALVGNKIDLEHMRTIKPEKHbRFCOENGFSSHFVSAKTG 

DS V FLCFOK VAAEI LG I KbNKAE I EQSOK WKAD 1 VN YNQ EPMS 

RTVNPPRSSMCAVC- 


5514 


125f- 




VNRPSWJ KGNFRGHALPGTFFFI JGLWWCTKS J IjKYICKKOKRT 
C YLGS KTL F YR^E I LEG! T I VGMALTGMAGEQ FI PCCPK LMLYD 
Y KC^KWNQLLGWHH FTMYFFFGLLGVADILCFTI SSLP VS l/TKD 
MLSNALFVEAFI FYNHTHGREMLDI FVHOLLVLWFLTGLVAFb 
EFLVRNN^LEliLRSSLIbIiC<5SWFF0IGFVLYPPSGGPAKDLM 
DHENILFLT1CFCWHYAVTIVIVGMNYAFITWLVKSRLXR1.CSS 

C V vs Lt la In, <i r\C Iv Bi V d «3 C O O* i 


5515 


1572 


260 


FVRLVGRGDCDPLLSVCLTTMPLYEGrXJSGGEKTAWlDLGEAF 
TKCGFAGETGPRCI 1 PSVIKRAGMPKPVRWQYN1NTEELYSYL 
KEF1H3 LYFRHLbVNPRDRRWI iESVLCPSVlFRETbTRVLFKY 
FEVPS V LI^APSHLKALLTLGI NSAMVhDCG YR ES l,VLPl Y EG I P 
VLNCWG ALFLGGKAI-HKEbETQj^EQCHVDTS VA K EOSLP S VMG 
SVPEGVLEDI KARTCFVSDLKRGhKlQAPiKFNl DGNNERPSPPP 
NVDY PU>GEK1 I LGS 3 RDS WEI LFEQDNEEQSVATL J LDSl> 
I OCP I DTR KCLAENL.W3 GGTSMLPG FLHRLLAE I RYLVE KP KY 
K KALGTKTFR IHTP PAKANCVAWDGGAI FGALQD I LGSRS VS KE 
YYNQTGR I PDWCSLNNPPLEMMFDVGKTOPPl^KRAFSTEX 


5S16 




735 


KSREP POAGPGPS PR KS PTASSFI>FP WR P F WMGACK5A0ES 
I KAMWR VPGTTRRP VTGES PGMHRPEAML.LULTLALLGGFTWAG 
KMYGPGGGKYFSTTEDYDHE1TGLRVSVGLLLVKSV0VKXGDSW 
DVKLG ALGGNTQEVT LQFGEY ITKVFVAFQAFLRGMVMYTS KDR 
Y F Y FG K bDGO I S S AY PSQEGQ VI»VGI YGQ Y 1 KS I G FE KNY 
PLEEPTTEPFVNLTY SANSPVGR 


5517 




4 99 


S EI YVAMRTDSS KMTDVESGVANFASSARAGRR NALPDI QS S AA 
TrX5TSDLPLKl^Al^V)G5:nAKEXT3EKTTODOLE7<PON£EK 


ssia 


3 


1375 


DAHADAVJm^>OIJWDFPCI.WbGLLLPLVAALDFNYHRQEGMEA 
FLKTVACNYSSVTHLHS3GKSVKGRNLHVLWGRFPKEHRIGIP 
EFKYVANT^HGDETVGRELIAHLIDYLVTSDGXJDPE1TK1.1NSTR 
IHIMPSKNPPGFEAVKKPIKYYSIGRENYNQYDLNRNFPDAFEY 
Nl JVS RO PE 7 VAVM KWLKTETFVXS ANLHGGALVAS VP FDNG VQA 
TGAL. Y S R S LTPDDD V FQ Y LAHT Y ASRNPNMKXGDECKN KMNF PN 
G VTNGYS KY PLQGGMQD YNY I WAQCFEI TLE LSCCKY PR E EKLP 
S FWNNNKASLIEY I KQVHiX-VKGQVFDONGNPLPmn VEVQDRK 








H:C?YRTKKYGEYYbbM>i>GSYIlKVi-VPGHDyHlTKVl IPEXS" 
ON r SAL K KDI L» I . F FQGQ I jDS 1 P V S N P S CPM I PL YRN L P DH S AAT 
KPSLFLFLVSLLHI FFK 


5519 


8'/ 


477 


1 KSKl^G^VEVCESLWRLTEAXGPTMGKESGWDSGRAAVAAVVG 
GWAVGTVLVALS AMGFTSVG1AASS I AAKMNSTAAl ANGGGVA 
ACSLVAI LOS VGAAGLS VTSKVI GGFAGTALG^H. GSPPSS 


5520 


11'/ 


T^r^ 

943 


PTLGRQKVLKTFTVPRSAlAMTKTSTClYHP'LVLSKyTFLMYYI 
SQFX2KDEVKPX j LANGARWKYMTLLNLLLQ7J FYGVTCLDDVLK 
RT KGGKD1 KFLTA F RDLLFTTLAFPVSTFVFLAFW I LF LYNRDL 
IYPKVLDTVIPVWLNHA/-5HTF1FPITLAEVVLRPHSYPSKKTGL 
TLLAAAS 1 AY I SRI LWLYFETGTWVY PVFAKLSLLGLAAFFSLS 
YVFIASIYLLGEKLNHWKWVSVQILQ^WRLESVGICFOWPDWKS 
PAKHQLVXNIR 


5521 


54* 


911 


KlLNMOKSCKENtr-KPCMMPKAEEDRPLFDVPOEAEGNPOPSEE 
GVSOEAEGNPKGGFNQPGOGFKEDTPVRHLPPEH^JRGVDELER 
LREEIRRVRNKFVKKKWKQRHSRSRPYPVCFRP 


5522 


1224 


637 


GSRPbGCRSREKMKVFGYGSLJWKVDFPYODKLVGYITNYSRRF 
WQGSTDHKGVPGKPGFA'VTbVEDPAGCVWGVAYRbPVGKEEEVK 
AYI'DFREKGGYJ\T77VIFYPKDPT7K?FSVLLYIGTCDNPDYL£ 
PAPLEDI AEOI FN AAG PSGRNTEYLFELANS I RNLV Pb'K ADEHL 
FALE KLVKERLEGK OK b»CI 


5523 


3 


1280 


SKGKKRMGSSMSAATARRPVFDDKEDVNFDHFOILRAIGKGSFG 
KV C 1 VQKR DTE XM Y AW K YUN KOQC I ER DEVRNV ? RELE 3 UQE I E 
HVFLVNLWYSFODErDMFMWDLLLGGDLRYHLQONrVOFSEDrV 
RLYI CEMALA1.DY LRGQHI IHRDVXPD^LLDERGKAHLTDFNI 
ATI J KDGERATALSGTKPYMAPEI FHSFVNGGTGYSFEVDWWSV 
GVMAYELLFGWRPYDlHSSNAVESLVQbFSTVSVQYVPTWSKEM 
VALLRKLLTVNPKHKLS.S'LQDVgAAPALAGVLWDHLSEKRVEPG 
FVPNKGRLKCDFTFELEEMII.ESRPLHKKKKR1AKNKSKDNSRD 
SS0SENDYLQDCI.DAIQQDFV3 FNREKLKRSQDLPREPLPAPES 
RDAAEPVEDEAERSALPMCGP1CPSAGSG 


5 52 A ~ 


Si 


2319 


RERERDHRPGESSOGOSGAGGCFPSPTMELRCGGLbFSSRFDSG 

hbahvekveslesdgegvgggasaltsglasspdyefnvwtrpd 
caetefengnrswfyfsvrggmpgklixinimnmnkosklysog 
mapfvrtlptrfrv:erirdrptfemtetofvlsf^rfvegrga 
ttffafcypfsysdcqeblnoldorfpenhpthsspldtlyyhr 

ELLCYSLDGLRVDLLTITSCHGLREDRFPRI.EQ1.FPDTSTPRPF 
RFAGKR I FFLS SR VHPGETPS S FVFNG FLDFI LR FDDPRAQTLR 
RLFVF KLI F MLN PDGWRGH YRTDSRGVNLNRQ Y LK PDAVLH ?A 
1 YGAKAVLiLYHIT/H S R LNSQS SSEHQPSS CLP PDA? VSDbE KAN 
NLCNEAQCGKSADR PiNTVEAWKQTEPAEQKLNSVWl KPQQSAGLE 
E5APDTI PPKESGVAY YVDLHGHASKRGCFMYGNS FSDESTQVE 
NMLYPXLI SLN S AH FDFOGCNFSEKNMYARDRRDGQSKEGSGRV 
A3 YKASGI iHSyTLECNYWTGRSVNSl PAACHDNGRASPFPPPA 
FPSRY TVBLFEQVGRAMA 1 AALDMAE CNPW PRI VLS EHSSLTNL 
RAKML KHVRWS RGLSSTLNVG VNK KRGLRTPPKSHI^ GLP VS CSE 
NTLSRARSFSTGTSAGGSSSSOONSPQKKNSPSFPFHGSRPAGL 
PGLGS STQKVTHR VLG PVFGKPVWEP LOHVFGCLGH CWGK 


5525 


105 


834 


SNTLDFERHLFIKGQ01SDQTQLVINKLPEKVAKHVTLVRESGS 
LTYE E FLGRVAELHDVTAKVASGQEKKLL FEVQPC SDSS AFW KV 

VVKVVk.1 IN. J Obi VUunirmbl W " — \/Lt X FsJJJl 1 OVArtU V XtfVJ 

SSTSEEPDENSSSVTSCOASLWMGRVKQLTDEEECCICMDGRAD 
LI LPCAHSFCOKCI DKWSDRHRNCPICRLQMTGANES WVSDAP 
TEDDMANY I LNMADEAGOPHRP 


5526 


3 


853 


RRPCNPVRAAKJR7GAAARAPRGLEVTMLR VAWR TLS LI RTRAVT 
QVLVPGLPGGGSAKFFFNQWGLQPRSLLLQAARGyWRKPAOSR 
LDDDPPPSTLLKDYQhA/PGIEKVDDVVKRLLSLEMANKKEMLKI 
X0ECFMKKIVANPEDTRSLEARI1ALSVK1RSYEBHLEKJ1RKDK 
AH KRYLLMSIDQRKKJ4LKNLRNTNYDVFEK I CWGLGI EYTFPPL 
YYRRAjiRRFVTKKALci RVPQETOKLKKRRRALKAAAAAGKOAX 








RRNPDSPAKAIPKTLXDSQ 


552 7 


322b 


565 


LbR KY bLHQN ?bb! .RHQPNRTCI S ?S ATMK bKDTKSR PXQSSCG 
KFCTKGlKWGKWKEVKIDFNNFAIXSCMDDLVCFEELTnYObVS 
PAKNPSSLFSKEAPKRKAQAVSEEEEEEEGKSSSPKKKIKLKKS 
KNV A TEG TCTQKE FE V KDPELEAQGDDK VCDD ? EAG EMTS EN L V 
0TAPKKKKNK6KKGLEPSQSTAAKVPKKAKTWIPEVHDQKADVS 
AMKDLFVPRPVLRALSFLGFSAPTPlOALTIiAPAIRDKLDILGA 
AETGSGXTLAFAIFMIHAVLQWQKRNAAPPPSNTEAPPGETRTE 
AG A ETR S PG K AE AESDALPCDTV 1 ES EALPSD1 AAE ARAXTGGT 
VSDOALLFGDDDW5EGPSSL3REKPVPKQNEHEEENLDKEQTGN 
LKOFLDDKSATCKAYPKRPLLGLVLTPTRELAVQVK0HIDAVAR 
FTG I KTA J h VGGMSTQXQQRMLNRRPEI WATPGRLWELI KEKH 
YHbRNLROLRCLWDEADRMVEKGHFAELSOLLEMLNDSOYNPK 
KOTLVFSATL.TLVH0APAR1LHKKHTKKMDKTAKLDLLMOKIGM 
RGKPKVI DLTRNEATVETLTETK IKCETDEKDFYLY Y FLMQYPG 
RSbVFANE] SCI KR bS GLLKVLD I MP I /TbH A CMHQKQR LRNLEQ 
FAR LEDCVLbATDVAARGLDI PKVQHV1 KYOVP^TS EI YVHRSG 
RTARATN EG b£ LMU G PEDVI N FK K 1 YKTLKKJDED1 PLFPVQTK 
YKDWK.ER1 RLARQI EKSEYRNFOACbHNSWI EQAAAAbEI EbE 
FDMYXGCKADWEERRRQXQMKVLKKELRHLLSQPbFTESQKTK 
YPTCSGXPPHjVSAPSKSESAI>SCbSK0KKKKTKKPKEPQPEOP 
GPSTSAN 


5528 


'* 


895 


GPFLSACRMWGACKVKVHDSLATISITLRRYLRLGATMAKSKFE 
Y VR DPEADDTCI AHCW WVRLDGRNFHR FAEKHNFAKPNDS RAL 
CLKT K CA0T\'MEELED 1 V I AYGOSDF YSFVFKRKTNV; F KRRASK 
FKTHVASOFASS YVFY WRDYFEDQPLbY PPG FDGRVW Y PSNCT 
LKDYLSWOADCKlNNbYNTVFWALIOQSGLTPVOAOGRLQGTL 
AADKNDILFSEFNINYNNEPPNSYRKGTVLIWOKVDEVMTKEIKI, 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWb 


5529 


4 6 


640 


TFRbVS/^HLKTRKLINPEAAERRWRDWDSRQGWLSVKMORVSGL 
I :S KTLS R VLW LSGLSEPGAAROPR I MEEXAbE VYDU RTIR DPE 
KPKTLEELEWSESCVEVOElNEEEYbVllRFTPTVPHCSbATL 
1 GLCLRVKLORCLPFKHKbEI Y I SEGTHSTEED INKQ1 NDKERV 
AAAME* PNLR EI VZQCVLEPD 


5530 




2606 


A0I VHAI SYCHKbirVGHRDL»KPENWFFEKQGLVKbTDFGFSNK 
FOPG KKLTTS CGS LAY S APE I LLGDEY DAPA VD I W SLGV 3 L F ML 
VCGOPPFQEANDS ETbTMlMDCK YTVPSHVSKECKDLJ TRMLQR 
DFKRRAS LEE1 EKHPWLQGVDPSPATKYNIPLVSYKNLSEEEMN 
91 1 ORMVU3D3 ADRDAIVEALETNRYI^HITATYFLLAERILREK 
OFKE3 QTRSAS PSNI KAQFRQS WPTKI DVPQDLEDDLTATPLSH 
ATVPO 5 PARAADS VLNGHRS KGLCDS AKKDOLPEIiAGPALSTVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQVVbRRKPS 
VTNR bTSRKS APVLNQ I FEEGESDD E FDMDENL PP KLS R LKMNI 
ASPGTVHKRYHRRKSOGRGSSCSSSETSDDDSF.SRRRLDKDSGF 
TYS WHK RDSSEGPPGSEGDGGGOS XPSNASGGVDKASPS ENNAG 
GSSPSSGSGGU PTNTSGTTR RCAG P S NS MQLAS R S AGE LVESLK 
LMSbCLGSQLHGSTKYIlDPONGLSFSSVKVQEKSTWKMClSST 
GNAGOVPAVGG 1 KFFSDHMADTTTELERI KSKNLKNNVLQLPLC 
EKTI SVNlQRNPKEGbLCASSPASCCHVI 


5531 


24 


515 


GSOPRAPRPRDSMERPEPELIROSWRAVSRSPLEHGTVbFARUF 
ALE P D hh P bFQ YNCR Q FS S P F.DCT »S S P E FLDH IRK VMLV 1 DAAV 
TNVEDLSSLEE YLASLGRXHRAVGVKLS S FSTVGESLbYMLEKC 
LGPA FTPA TRAA WSQLYGA WQAMSRGWDGE 


5532 


339b 


1402 


SDKWVVGKRKMl lEDETEFCGSELLHSVbQCKSVFDVLDGEEMR 
RARTRANPYEKI RGVFFL>niAAMK>lANT4DFVFDRMFTNPRDSYG 
XPLVKCREAELMFADVCAGPGGFSEY^/bWRKKWHAKGFGMTLK 
GPNDFKLEDFYSASSELFEPYYGEGGIDGDGD3TRPENISAFRN 
FVtXNTDRKGWFLMAIX^FSVEGOENLOEILSKOLLbCOFLMA 
LS I VRTGGHFI CKTFDLFTPFS VGLVYLLYCCFERVCLFKP ITS 
R PAN S ERYWC KGLKVG 1 DDVRDYLFAVN I KLNQLRNTDSDVNL 








VVPLEVIKGDHEFTDYMIRSNESHCSLQIKALAKJHAFVODTTL
SEFROAEIRKKCLRLWGIPD0ARVAPSSSDPKSKFFEL1OGTE1
DI FSYKPTLLTSKTLEKlRPVFDyRCMVSGSE0KFLIGU5KSQI
YTKDGROSDRWI KbDLKTELPRDTLLSVEIVHELKGEGKAORKI 
SAIHILDVLVI.WGTDV^^OHFNQRI QIAEKFVXAVSKPSRPDMN 
PI RVKEVYRLEEMEK1' FVRLEMK1 1 KGSSGTPKLSYTGRDDRHF 
VPMGLY1 VRTVNEPWTMGFSKSFKKKFFYNKKTKDSTFDIjPADS
lAPFHIcyYGRLFWEWGDGIRVHDSQKPQDODKLSKEDVLSFlO
MHRA 


5S33 






MKERRAPQPWARCKLVLVGDVOCGKTAML0VLAKDCYPETYVP
TVFENYTACLETEEQRVELSLWDTSGSPYYDNVRPLCYSDSDAV
LLCFDISRPETVDSALKKWR1T.1EDYCPSTRVLLIGCKTDLRTD
LSTLMELSHQKOAPISYEOGCAlAKOl-GPEIYLEGSAFTSEKSl 
HS 1 FRTASKLCLN KPSPLPOKSPVRSLSKRLLHLPSRSELI SPT 
FKKEKAKXCSIM 


5534 


3 


60! 


LVKGRARAANPGRVGAMIX5LR0RVEiIFLEORNLVTEVLGALEAK
TGVEKRYLAAGAVTLLSLYLLFGYGASLLCNLIGFVYPAYASIK
A3ESPSKDDDTVWLTYWV\ r YAbFGliAEFFSDLLl>SWFPFYYVGK 
CAFLL>FCMAPRPWNGALMLYORWRPLFLRHHGAVDRIMNDLSG 
RALDAAAG ITR NVX PSOTPQPKDK 


S53S 


102 S 


33; 


KSFMDSEARLCSLVEl^SuTODETQKSDSENEDLKlDCLQESOEL 
NLOKLKNSERl l.TEAKQKMREI/TVN 3 KMKEDL1 XEblKTGNDAK 
SVSKOYTLKVTKLEIIDAEOAKVELTETQKQLCELENKDLSDVAM
KVKLQKEFRKKVDAAKI.RVOVLOKKOODSKKIASLSIQNEKRAN
ELEOSVDHMKYOKICbORKLOEENEKRKObDAVIKRDQQKI KVI
LSYI PAKYNMKC


5536 




28', 


AAATAAS LSPRGCR LRTPSSDVS PS RAPPPSAAPLPTGRAQMSP 
SGRLCLLTIVGL1 L ? TRGQTL XDTTS S S S AD AT I M D I QVPTRAP 
DAVYTEbQPTS PT P TW F ADET PQPQTQTQQbEGTDG PbVTD P ET 1 
HKSTKAAHPTDPTTTI^ERPSPSTDVQTDPOTLKPSGFHEDDPF 
FYDEHTLR KRGLLVAAVLFITGI 1 1 L TSG KC RQLS R L»CF NHCR


553/ 


3 


2351 


RMWSSPQLdWFRSGRPKRLRVbRJNRTSVALRJ.AGTGRFVAXT 
PGHPGSWEMGLLTKKDVAVEFSLEEWEHLEPAOKNLYODVMLEN 
YRNLVS LG LWS K PDLI TFLEORKE P WNV KS EETVAI QPDVFSH 
YNKTLLTEHCTEAS FQKVISRRHGS CDbENLHLRKRWKREECEG 
HNGCYDEKTFKYDCFDESSVESLFHQQILSSCAKSYNFDOY'RKV 
FTHCSLLNQQEE1DIWGKHHI YDKTSVLFRQVSTLNSYRNVFIG 
EKNYFICNNSEKTLNQSSSPKNHOENYFLEKOYKCKEFEEVFLOS 
MHGCE KCEOSY KCN XCVEVCTOS bKK J QHOT2H3 R EN S YSYN K Y 
DXDLSOSSNLRK03 IHNEEKPYKCEKCGDSLNHSLHLTQHQI IP 
TEEKP Y K WKECG fCV FNLNCSL YLTKQQC; I DTGENLYKCKACS KS 
FTRSSNLIVHQR3HTGEKPYKCKECGKAFRCSSYLTKHKR3H7G 
EKPYKCKECX?K7lFNRSSCLT0H0TTHTGEKLYKCKVCSKSYARS 

snli m1iqrvhtgek py kckecgicvf srs scltqhrx i htgbnly 
kckvcakpftcfsnlivherihtgekpykckecgkafpysshli 
rkhriwtgekpykckacsksfsdssgltvhrrthtgekpytcke 

CGKAF S Y S S DV 1 OH R R 1 HTGOR ? Y K CEECG KAFNY R S YLTTHOR 
SHTGERPYKCEECGKAFWSRSYLTTHRRRHTGERPYKCDECGKA 
FSYRSYLTTHRRSHSGERPYKCEECX;XAFNSRSYXIAHQRSHTR 
EKL 


tC ID 


J £.K> 


t <r - 
1 b J 


Jl wrjrini** i vO A r V ■ ipi 1 1| ii «i ivr i ■ i us 0\^ri\^ l_>o V_ lUrrnl tr o j. ru 

I PGTPGPDGOPGTPGI XGEXGLPGLAGDHGEFGEKGDPGIPGNP 
GKVGPKGFMGPKGGPGAPGAPGPXGESGDYKATQKJAFSATRTI 
NVPLRRDOT1 R FDHV I TMMNNNY E PR SGX FTCKVPGL YY FTYHA 
SSRGNLCVTJLMRGRERAOKWTFCDYAYWTFQVTTGGMVTiKLEQ 
GENVFLQATDKNSLLGMEGANS I FSGFLLFPDMEA 


5539


38 


12S*T 


H RG PSGAAAPGCALPRGQALEGPRS CRR PQ PMARRYDELPHYPG 
I VDG PAAIASFPET V PAVPGPYGPHR PPOPLP PGLPSDGLKREK 
DEIYGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCS 
SDSFNED1AAFAKQVRSERPLFSSNPELDWLVIQAIQVLRFHLL 
SEQ ID NO:

Predicted beginning nucleotide location
corresponding to first amino acid
residue of amino acid sequence

Predicted end nucleotide location
corresponding to first amino acid
residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








ELEKViJDLCDNFCHRYlTCLKGKMPI DLVIEDRDGGCKEDF5DY 
PASCPSL.PDONNMWIRDKEDSGSVHLGTPGPSSGGLASQSGDNS 
SDgGPGLDTSVASPSSGGEDEDLDOER^RNKKRGIFPKVATNlM 
RAW1*F0HLSKPY PS EEQKKOLAODTGuTI LQVNNWFINARRR I V 
OPMIDQSNRTGCGA^FfPEGOPIGGYTETQPHVAVRPPGSVGMS 
LNLEGEWKYI* 


5540


148 


24 4 C 


PPLGAGAGVHAnSPHPARRl»?LTTAGVGGRAPDLLPTFWRQHRG 
PSGAAAPGCALPRGOAXEGPRSCRRPO?MARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGFHRPPOPLPPGLDSDGLKRBKDEI 
YGHP1»FP LLALVF E KC ElATCS PRDG AG AGLGTP PGGDVCS S DS 
FNEDNTAFAKOVRSSRPLFSSNPELDKLMICA1QVLRFHLLELE 
XGXMP JDLVI EDR DGGCK EDFEDYPAS CPSLPDQIW I W I RDHED 
SGSVHLGTPGPSSGGLASOSGDNSSDOC-VGLDTSVASPSSGGED 
EDLDOEPRRNKKRGIFPKVATNIMRAWLFOHLSHPYPSEEQKKQ 
LAQDTGLT1 XjOVNWWFINARRRI VQPM 1DQSNRTGOGAAFSPEG 
QPIGGYrE7*EPHVAFRAPASVGDEFCTRKEEWHYL 


5541

| 


143 


i^40 


PPbGAGAGVHAJRSPHPARRLPLTTAGVGGRAPDLLPTPWROHRG 
PSGAAAPGCAI.FRGOALEGPRSCRRPCPMARRYDELP^YFGIVD 
GPAAIJ\SFPETVPAVPGPYGPHRPPOFLPPGLDSDGLKREKI>EI 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTA FA KQVH SERP LFSS K PELDN LM I QA1 QVLRFHLLELE 
KGKMPinLVlEDRDGGCREDFEDYPASCPSLPDONinWlRnHED 
SGS VH LG TPG PSSGGLAS05G DNS SDCK5VGLDTSVASPSSGG ED 
EDLDQEP R KUKKRG 1 FP KV ATN I MRAW LFOHLSHP Y PS E EQKKQ 
LAODTGLTi 1X?VNN WF I N AN R R I VQPK 1 DQSNRTGOCAAFS PEG 
QF1GGYTET2PHVAFRAPASVGDEFGTRKEEWHYI. 


5542

i 


148 


144C 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWROHRG 
PSGAAAPGCALPRGQAJLL'GFRSCRRPCPMARRYDEIiPHYPGl VD 
C PAAJLAG FPETVPAVPG P YGPHRPPQPLP PGLDSDGLKREKDB I 
YGHPLFPLlALVFEKCEIATCSPRDGftGAGLGTPPGGDVCSSDS 
FNEDNTAFAKOVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDONNIWIRDHED 
SG$VHLGTPGPS?GG1*AS0SGDNSSDCGVGLDTSVASPSSGGED 
EDLDOEPRRMKKRGIFPKVATNIMRA^FQRLSHPYPSEEOKXQ 
LAQDTGLT I LQVNNWFI NARRR1 VQPM1 DQSNRTGQGAAF3PEG 
pPIGGYTETEPHVAFRAPASVGDEFGTRXEEWHYL 


5543


240S 




RWVREOPWPL.RT?EAVKTPALRPFPGPRGVSPFPKP n »GXSPAF 
KRPFSDSGAFWSPERRPGVLEAPRRRPVPASFRAVPPKPTRVHG 
SSASRDRVliARTMIVADSECRA.ELKDYl.RFAPGGVGDSGPGEEO 
KESRARRGPRGPSAFIPVEEVLKEGAESLEOHLGLEALMSSGRV 
DN1AWMG I.H PDY FTSFW R LHY LLLHTDGPEAS SWRHY I AI MAA 
ARHOCSYLVGSHKAEFLCTGGDPEWLLGLHRAPEKLRKLSEINK 
LLAHRPML1 TKEHIGAX.LKTGEKTWS1 AEL 1 QALVLLTHCHSIiS 
SFVFGCGI LPEGDADGS FAPCAPTPPSEOSSPPSRDPLNNSGGF 
ESARDVEALMERHOQLOESLLRDEGTS0EEMESRFELEKSESLL 
VTPSADIbEPSPHPDKLCFVEDPTFGYEDFTRRGAOAPPTFRAQ 
D YT WEDHG YS1> I OR LYPEGGQLL DE K F 0 AA YSLT YNT J AMHSG V 
DTSVLRRA1WNY1 HCVFGI RYDDYDYGEVNOLLERNLKYYIKTV 
ACYPEKTTRRMYTCLFWRHFRHSEKVirVNLLLLEARMOAALljYAL 
RAITRYMT 


5544 

I 
i 


1895 


524 


LGGLLGROR LLLRMGAGR LGAPMERHGRAS AT.S VS S AGEQAAGD 
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 
QRVES LR KXR PLF PWFGLP3 GGTL VKLV YFEPXD1 TAEEEEEEV 
ES LKS I R KY LTS NVA YGS TG I RD VHLE L KDLTLCGR KGNLH FIR 
FPTHDMPAF3OMGRDKNPSSLHTVFCATGGGAYKFE0DFLTIGD 
LQLCKLCELDCLIKGILYJDSVGFNGRSOCYYFENPADSEKCOK 
LPFDLKN P YPLLL VN I GSGVS I LAVY S K DN YKR VTG TS LGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTiarDKLVRDlYGGDYBRFG 
LPGWAVAS E FGNKMSKEKREAVS KEDLARATLITI TNN1C-SI AR 
MCALNENI NOWFVGNFLRlNTIAMRLU^YALDyWSKGQLKALF 
SEHEGYFGAVGALUELLKI P 


5545


802 


131 


GAMWSAGRGGAAWPVLLGLLLALLVPGGGAAKTGAELVTCGSVb 
KLLNTHURVRI.HSI YCl KYGSGSGQQSVTGVEASDDANSYWR3RG 
GSEGGCPRGSPVRCGCAVRLTHVLTGKNLHTHHFPSPLSNNOEV 
SAFGEDGEGDDLDI-.V.TVRCSGCHWEREAAVRFOHVGTSVFLSVT 
GEQYGSPIRGOHEVH0MPSANTHNTWKAMEGIF1KPSVEPSAGH 
DEL 


5546


1592 


lie 


W P R G GH S S MG0. S G P 1" RHOKRARAOAQLRNLEAYAANPHS FVFT 
RGCTGRS T IROLSLDVRRVMEPLTASRLQVRKKrv T SLKDCVAVAGP 
LGVTHFL3LSKTETKV-YFKLMRLPGGPTLTF0VKKYSLVRDWS 
SLRRHRMHEQOFAl^rtLLVLNSFGPHGMHVKLKATMFONLrPSI 
NVHKVMLNTI KRCLi: DYNPDSOELDFRHYS J KWPVGAHHGWK 
KLLQEKFPNMSRLOr 1 SELLATGAGLSESEAEPDGDHN3TELPO 
AV AG RGNMRA00 S A V R L»T E I G PRMTLQL I KVQEG VG EG KVM FH S 
FVSKTEEELQA1LEAKEKKLRLKA0RQAQQAQNVQRKOE0REAH 
RKKSLEGMKKAR VGCZDEEASGl PSRTASLELGEDDDEQEDDDI , 
EY FCQAVGEAPSEDLF PEAKQKRLiAKS PGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSOGA0ARRGPRGASRDGGRGRGRGRFCKRVA 


5547 


1592 


146 


FV PRGGHS^ MG0SG RSRHQ KRARAQAQLRKLEAY AANPHS FVFT 
RGCTGRNIROLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
L>GVTHFLILSKTETN'v r YFKLMRLPGGPTLTFOVKKYSLVRDWS » 
SLRRHRMHEOQFAHPPLLVl.NSFGPHGMHVKLMATMFONLFPSI 
NVHKVNLNT3 KRC1.LJ DYNPDSQELDFRHYS3 KWPVGASRGMK 
KLLQEKFPMMSRbCri S EU ATGAGUSBSEAEPDGDHN3 TEJoPQ 

avagrgnmraoosavrlteigprmtlolikvoegvgegkvwfrs , 
fvskteeelqa: leakkkiclrlkaoroaqoaonvqrkqeqreah ' 
r kks legm k kar vggs deeasg3 psrtaslelgednneqeddd3 
eyfcqavgeapsedlt peakqkrlaks pgrkrkrwemdrgrgrl 
cdokfpktkdksogaoarrgprgasrdggrgrgrgrpgkrva ; 


5548 


1 


215? 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSLPLAHALRGNETTA 
DSNETTTTSGPPDPGaS0PLIAWLLLPLLLL1»LVLLLAAYFFRF 
RKQRKAVVSTSDKKJSFNGI LEEQEOORVMLLSRSPSGPKKYFPI 
PVEHLEEEI R 3 RSADDCK0FREEFNSLPSGH1QGTFELANKSEN 
REKNRYPNILPNDHSRVlLSGLDGlPCSDYJNASyJDGYKFKNK 
F 3 AAQG PKQE TVN D FKR MVWEQKS AT 1 VMLTNLKERKEEKCHOY 
WPD0GCWYGN1RVCVEDCVVLVDYT2RKPCIOPOLPJDGCKAPR ' 
LVSQLHFTSWPDFG VP FTP I GMLKFhKKVKTLNPVHAGP I WHO 
SAGVGRTGTFI VI DAMMAKMIIAEQKVD VFEFVS R I RNQR PQMVQ 
TDMQYTFI YOALLEYVXYGDTELDVSSLEKKLQTMHGTTTHFDK 
IGLEEEFRKLTtfVR IMKENMRTGNLPANMKKARVIQI IPYDFNR 
VILSMKRGQEYTDY INASFI DG Y RQKDY F I ATOG P LAHT V E DFW 
RM7W EWK SHTI VMI/J'EVQE R EQDKC YQ Y W PT EGS V THGE J T I El 
KNDTLSEAI S I RDFLVTLNQ PQARQE EQVP VVRQ FH FHG W P E I G 
I PAEGKGMI DL3 AAVOKOOOQTGNHP ITVHCSAGAGRTGTFI AL 
SNILERVKAEGLLDVFCAVKSLRLQRPHMVQTLEOYEFCYKWO 
DF3DI FSDYANFK 


5549

1 


915 


256 


FF^TGGKRI^FKMAGTA^HDREMAIOAKKKLTTATDPIERLRLO 
CLARGSAGI KGLGRVFR IHDDDNNRTLDFKEFMKGLNDYAWME 
KEEVEELFOR FDKDGNGT3 DFNEFLLTLRPPKSRARKEV1 MQAF 
R KU)KTGDGV I TI F.DLR EVYNAKHHPKYQNGEWS EEQVFRKFLD 
NFTDSPYDKDGLVTPEEP'I'INYYAGVSASIDTDVYFI IMMRTAWKL 


5550 

! 


2364 


1 2 1 C 


t\ (VIA AV f i_l IM IA f> JjIX rv A r\ i JJo JL# ■ K tr f\ v rco I * Jw X *Jnm\J\y k V 

SL1AFTTMALLTJ MEFSVYQDTWMKYEYEVDKDF5SKLR 3NIDI 
TVA^KCOY VGADVLDI .A FTMVAS ADGLVYEPTV FDI .S PQOKE WQ 
RKLQL30SRLQEEHSLQDVIFKSAFKSTSTALPPREDDSSCSPN 
AC^IHGHLYVNKVAGNWITVGKAIPHPRGHAHLAALVNHESYN 
FSHR J DHLSFGELV PAI 3NPLDGTEK 1 A3 DUNQM FQY Fl TW PT 
KLHTYK I S ADTHQFS VTERERI 3 NHAAGSHGVSG 3 FMKYDI^S S h 
MVTVTEEHMPFWQ F FVRLCG I VGGI FSTTGMLHG 1 GKF3 VE 3 1 C 
CRFRLGSYKPVKSVPFEDQiTDNHLPLLENNTK 


5551 , 211 


1-700 


K0RDHTMDYKESCPSVS3FSSDEHREKKXRFTVYKVLVSVGRSE 








\n FVFRRY AEFDKLY NTLKKQFP AMALK 1 PAKR1 FGDNFEPt)F I K 
U« KAGl.NbFj (jTiu \ « iPELYNHPDVK-Ai' LQML?SPX>fC>SDPSfc.DE 
D^USSOKbMCTSOtvlNbGPSGNPHAXPTDFDFLKVIGKGSFGKV 
LLAKR KLDGKF YAVKVLQKKI VLNRKEQXHI KAERNVLLKNVKH 
P FLVGIjHY SFCTTEKLYFNaDFVNGGELFFHLOR ER SFPEHRAR 
F YAAE 1 ASALGVI.KS I Kl VYRDLKPEN I LLDSVGK WLTDFGLC 
KF.GIAISDTTTTFCGTPEYLAPEVIRKQPVDNTVDWWCIJGAVLY 
EMLYGLPPrYCRDVAEMYDNJLHKPLSLRPGVSLTAWSJLEELL 
ElORQN R J jG AK E D F LE 1 QNH P F FES LS W ADUVQKK I P ? P FN PNV 
AGPDD3 RNPDTAKTEETVPYSVCVSSDYSI VNASVLEADDAFVG 
FSYAPPSEDLFL 


5552 


274 Z 


930 


tG PAAGAAMGK.KJ: K KHKAE WRSS YED Y ADKPLE K PLKL, VLK VGG 
SEVTELSGSGKDSSYYDDRSDHERERKKEKKKXKKKKSEKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
PVRACRTQP^EMESTPIQOLLEHFLROlWyOFHGFFAFPVTDA 
1APGYSMI 1 KMPMDFGTMKDKI VANEYKSVTEFKADFKLMCDNA 
MT Y NRPDT VY Y KLAKKI LHAG FKMMS KQAALLGNE DTAVEEPV P 
EWPVOVETAKKSKKPSREVISCMFEPEGNACSLTDSTAEEHVL 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSLLYSWNTAEP 
DADEE ETH P VD LSS LS S K LLPG FTTLG FK DER RNK VTF1>SSATT 
ALSMQNNSVFGDLKSDEMEL1JYSAYGDETGVQCALSI.0EFVKDA 
CS YSKKWDDLL.DO: TGG DHS R Th FQl < KQ R RNV PM K P P DE AKVG 
DTJiGDSSSSVLEFKSMKSYPDVSVDISMLSSIjGKVKKELDPDDS 
HLNLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDtWHL 
GS PSRLSVGEOPDVTHDPYEFLQSPEPAASAKT | 


5553 


1*. 


1095 


l^REAVYLV£RMDGPVAEHAKOEPFHVVTPliLESWAl,spVAGMP 
VFLKCENVOPSGSFK1RG1GHFCQEMAKKGCRHLVCSSGGNAGI 
AAAYAARKLG I FAT I V LPES TS LQWQR LQG EGAE' VO L TG K VKD 
E7^N LRAOEl^AKRDGWENVP P FDH PLI WKGHAS LVQE LKAVLRTP 
PGALV1 ^VGGGGLLAGWAGLLEVGWQHVPJ J AMETKGAHCFNA 
A I TAGXLVTLPD1 T£ VAKS LGAKTVAARALECMQVCK 1 HSEWE 
DTEAV£AVQ9LL»DDERNl>VEPACGAALiAAI YSGLbRRLC>AEGCl> 
PPSLTSVW 3 V CGGKN 1 NS R ELQALKTl IL<5QV 


5554




2310 


CSGR TGG RGSLR PAEN VCLTCK LSGAETRGLLCPALR TWj MKVL 
GRSFFWVLFPVbPWAVQAVEHEEVAQRVlKLHRGRGVAAMOSRO 
>AT?DSCRKL£GLLROKNAVLNKLKTAlGAVEKDVGl>SDEEja,P0 
VHTFE I FOKELNESENSVFOAVYGLORALQGDYKDWNMKESSR 
QRLEALRE7\AIKEETEYMELL«AAEKH0VEA1>KNMC>HC>NQSLSML 
DEILEDVRKAADRLEEEIEEHAFDDNKSVKGVNFEAVLRVEEEE 
ANSK0N1TKREVEDDLG-LCMLIDSQNWUYILTKPRDST3PRADH 
HF3 KDIVTIGKL>SLPCGWLCTAIGLPTMFGY1 2CGVLLGPSGLN 
S3 KS I VOVETLGEFGVFFTLF LVGLEFS PEKLRKVWKI SLOGPC 
YMTLLMI AFGLL,WGHL,LRI KPTQSVFI STCLSLSSTPLVSRFLM 
GS ARGDXEGD3 DY STVLLGMbVTQDVQLGLF KAVMPTLI OAG AS 
ASSS I WEVLRI IWLIGQI LFSLAAVFLLCLVI KKYLIGPYYRK 
LHMRSKGNKE3 LI LGI SAFI FLMLTVTELLDVSMEU5CFLAGAL 
VSSCGpWTEE 1 ATS I EPI RDFLAI VFFAS IGLHVFPTFVAYEL. 
TVLVFI*TLS V\TVT^K?LLAAi.Vl»SIiII»PR S SOY I KW I V S AGLAQV 
S2FSFV1^SRAFRAGVISREVTLLII^VTTLSLLLAPVLWRAAI 
TRCVPRPERRSSL 


5555 


212 


1425 


LSIJ^TRETPAPPRCEAASG^RVGWFIADAAAEEAVRSVWNRTRDR 
GTMA PQNLS TFCL LLL YL1 G A V I AGR D F Y KI LG VPR S AS I K£> I K 
KAYRKI^OLJIFDWJPDDPOAQEKFQDLGAAYXVliSDSEKRKQY 
DTYGEEGLXTJGKOSSHGDIFSHFFGDFGFKFGGTPROODRKIPR 
GSD 1 1 VDLE VTLE E VY AGN F VEVVRNKP VARQA PG KR K CNCR QE 
MRTTOLGPG^FOMTQE VGCDE CPNVXLVNEERTLE VE1 EPG VRD 
GMEYPF3GEGEPHVDGEPGDLRFRIICVVKHPIFERRGDDLYTNV 
TISLVESLVGFEMDZ THLDGHKVHISRDX1 TRPGAKLWKKGEGL 
PNFDNNNI KGSLI 3 TFDVDFPKEQLTEEAREGlKObLKQGSVQK 
VYNGLQGY 


5556 


5835 


3346 


RTRGMS KNCVPME FEE YbL^MFOGTFYLLOKI TKDNNAHTVKSR 








LEELDESYIEXFTDFLRLFVSV^LRRIESVSQFPWEFLTLLFK 
YTFHOF7HEGYFSCLDIWTLFLDYLTSK3XSRLGDXEAVLNRYE 
DALVLLLTEVLrCKlOFRYNQAO^EELDDETLDDDOOTEWORYLR 
OS LEVA'AKVMELLPTHAFSTLFPVLQDNLEVYLGLQQFIVTSGS 
GHR 1 TAENDCRR LH CS LRDLS S LLCAVGRLAEY 71 GDVFAAR 
FITOALTWERLVKVTLYGS02 KLYNI ETAVPSVLXPDLI DV31AQ 
SIiAALOAY S HWl iAQYCSEVHRONTOOFVTJj I STTMDAI TPL1 ST 
KVQDK L LLS ACHLLVS LATTVRPVF LI S I FAVQKVFNR I TDASA 
LR LVDKAQVL v CRALS Nl LiLLP W PNLPfcr» r.QQWP VRS 1 NHASLI 
S ALSR DY RNLX PS A VAPQRKMPLDDTKL 1 1 HQTLS VLED1 VEN I 

sgestksrqjcyosloesvovslalfpafjhosdvtdemlsffl 

TLFRGLRV0MGVPFTEQI1QTFLNMFTRE0IAES3LHEGSTGCR 

wekflk1lovw0epgqvfkpflpsiialcme0vyp31aerps 
pdvxaelfellfrtlhhnwryffkstvlasvorgiaeeomenep 
ofsai moafgosflopdihlfkqnlfyletlntkqklyhkki fr 
tamlfqfvn vllqvlvhkshdllqee ig 3 a: ynmasvdfdg ffa 
aflpe flts cdgvdanoxs vlgrnfkmdr v r rergraxr raewa 
r k pgtcaarrg h 3 easgrglcppcslaaah em padlvl 


5557 


1712 


491 


VI1jGAGLRDKDMWIPWGLPRRLRLSALAGAGRFC11.GSEAA7R 
KHLPARJ^HCGLSDSSPnLWPEPDFRNPPRK/iSKASLDFXRYVTD 
RRLA5TLAQlYLGKPSR?PHLU»EC^PGPGlliT0ALbEAGAKW 
AI.ESDKTF3 PHLESLGXNLDGXLRVIHCDF FXLDPRSGGVI KPP 
AMSSRGLFKNLGlEAVpWTADIPLXV^/GMFrSRGEKRALWXLAY 
DLYSCTS3YXFGRIEVNMFJGEXEFQKXMADPGNPDLYHVLSV1 
W0IACE1KVLHMEFWSSFD1YTRXGPLENPKRRSLLD0LQOKLY 
LlOm PRONLFTKNLTPMNYN3 FFHLLXHCFGRRSATVI DHLRS 
LTPLDARDlbMOlGKQEDSKWNMHPODFKTLFETIERSKDCAY 
KWLYDETLEDF 


5558


15 OS 


96 


RAGCrHPUVPADLGAPAEPRRPOXTCVCLLOPQPGGORGPTTMI 
TGVFSMRLWTPVGVLTSLAYCLHORRVALAELOEADGQCPVDRS 

llklxmvqvvfrhgarsplkplpi^eeqvewnpqllevppqtqft) 
ytvtn^ggpxpyspydsqyhettlkggkfagqltkvgmqomfa 
lgerbrkny\ ; edlpflsptfnpoevfirstni frnlestrclla 
glfocqkegpi i ihtdeadsevlypnyqscvslrqrtrgrrqta 
slopg 1 sedlkkvkdrmgi dssdkvdffilldnvaaeoahnlps 
cpmlkr farm i eqravdtsly 1 lpxedres lomavgpplh i l.es 
nllkamdsatapdk1rklylyaahdvtfipllmtlgifdhxwpp 
favdltxelychleskewfvqeyywgke0vprgcplx5lcpldmf 
ln ams vytls pe xyhai>cs qtovl'isvgnee 


5559 


150 


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDIDSLLETLSPEEMEELEK 
ELDWPPDGS VPVGLRQRNQTEKOSTG VYN K EAMLNFCEKETKK 
LMQREKSMDESKOVETKTDAKNGEERGRTJASKJCALGPRRDSDLG 
KEPXRGGLKKSFSRJJRDEAGGKSGEKPKEEX1 IRGIDXGRVRAA 

MKEVAKKEDDEKVKGERRNTDTRXEGEKMKRAGGNTDMKKEDEK 
VKRGTGNTDTKKDDEKVKKNEPLKEKEAKBESKTKTPEKQTPSG 
PTK PS EGPAKVEESAAPS I FDEFLERVKNNLPEMTEVNVNNSDC 
ITKEILVRFTEALEFNTWXl>FALAOTRADDhrVAFAlAIMLXAN 
XT I TS LNLDSNH 1 TG KG I LAI FRALLQNNTLTELR FHNQRH I CG 
GKTEMEIAKLLKENTTLLKLGYHFELAGPRMTVTNLLSRNMDKO 
DnwDi npnpniinPiivrPvvnT T.PVPif J^nAVAKGSPXPSFOP^PK 
PSPXNS PXKGGAPAAPPPPPPPLAPPL I MENLKNSZ.SPATQRKM 
GDKVL PAQEXNSR IDQLLAA r RSSNIiKCLXKA'E VPKbLQ 


5560


9 


921 


SSV^EFSALSVSMACl^SPSOI^KFOODGFLVLEGFLSAEECVAM 
QQR 1 GE 3 VAEMDVPLHCRT^FSTOEEEOLRACGSTDYFLS SGDK 
IRFFFE KGVFDEKGNFLVPPEXS I NKIGHA1>HAHDPVFKS 1 THS 
FXVQTL.AR S3 jGLQMPWVnsMY I FKQPHFGG EVS PHODAS Fl»YT 
EP3J3R\TLG Wl AVEDATLENGCXWF J PGSHTSG VSRRKVRAPVG 
S AFGTS FLGSE PARDNSLFVPTPVORGALVL I HGEWHXS XQNL 
SDRSROAYTFHbKEASGTTWSPEKWLQPTAELPFPOLYT _ 


5561 


2175 


1775 


CYF1FOFFSSPYPGLHPKQTPAPLPNPGLYPPPVSMSPGOPPPQ 






| 0L«L?*VTV F SAPGW.NFGN?SYPYAPGALPPPPP?HLYPNTQAPS 
| C V YG CVT Y YN PAQG/QVQPKPSPPRRT PC 1 P VTI KPPP F E VVSRGS 



5562


34. 


;365 ! 55GKMO^j^AGAAGLVRGLKAGVLS0A£?YLNbVOCfrrLLDLKLH 
) LOS7DYC-N FLANEASPLTVSVlDDRLKEKMVVEFRHKRNliAYEP 
i LASFbDFITySVMIDNVILLITGTLHORSIAELVPKCKPLGSFE 
| CK EAVN 1 AQT PA c L YNA 1 LVDT PLAA FFQDC I S EQDLDE MN I E 1 
1 JRNTLrYKAYLESFyKFCTLLGGTTADAMCPI LEFHADRRAFI IT 
| IKSFGTEbSKEDRAKLFPHCGRLYPEG^OLARADDYEOVKNVA 
1 DYYPEYKI.LFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNOFHF 
\ GVFYAFVKLKEOSCRJUWlAECIAQRHRAXjnNYIPlF 






I3B5 


SSGYJJDyj\7'JkG^GLVRGLKAGVIJ>QADYLNbVQCKTLEDLKLH 
LO STDY GN P1AN E AS PLTVS VI DDRLK EKVATVEFRHMRNliAYEP 
LASFLDrl TYSYMIDNVILLITGTLHOnS JABLVPKCJIPLGSFE 
OKEAVNI AC'TPAELYNAllATDTPLAAFFQDCl SEQDbDEMNI El 
IRNTLYKAYLESFYKFCTLU5GTTADAMCP1 L.EFEADRRAFI IT 
I N S FG TELSKEDRAKLFPHCGRLYPEGL.^OLARADDYEOVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEUEVKLNKLAFJLNOFHF 
GV FY A FV K LKKQECRN 1 VW I A£C 1 AQR HRAK 1DNY1 PIF 







924 


RVRRDKRAVWTARGRRRCGDSMSGGWMA0VG7.WRTGALGLALLL 
LbGLGLGbFAAASPl^STPTSAQAAGPSSGSCPPTKFOCRTSGLC 
VPLTK R CDR DLDCE DGS DEEECR I E PCTQ KGCCP P P PGLPCPCT 
GVSBC SGGTDKKbRNCSRLACIAGELRCTI .SDDCI PLTWRCDGH 
PDCPDSSDELGCGTNEHiPEGDATTMGPPVTLESVTSLRNATTM 
GPPVTLESVPSVGNATSSSAGDQSGSPTAYGV1AAAAVLSA5XV 
TATLLLLSVJLRAOERLRPLGLLVAMKESLLLSEOKTSLF 


5565 




i*e 


RKNSPNPARAGSISRPCRAPGSVSAVA14TAAVFFGCAFIAFGPA 
1ALYV FT I ATE PLR 1 1 FL1 AGAFFWLVSLLI SSLVWFMAR V I ID 
HKDCPT0KYLLIFGAFVSVY1QEMFRFAYYKLLKKASEGLKSIN 
PGETAPSKRLliAYVSGLGFGIMSGVFSFVNTLSDSLGPGTVGIH 
GDSPQFFLYSAFMTLVI 1 LLH VFWG I V F FOGCE KK K WG I LL I VL 
LTHLIA'SAQTFI SS YYGJNLASAFI I LVLMGrWAFLAAGGSCRS 
LKJjC LLCC D KN FLL YNQRS R 


5566 


204 > 


1232 


SHICH>1GRGAQAPVKMVSWMISRAWLVFGMI>YPAYYSYKAVKT 
KNVKE Y VR WMM YV1 1 V FALYTV I ETVADOTVAWFPLY YELK I AFV 
J WI-l-S V Y T KGASLI Y RKFLH PLLSSKERE I DDY I VQAKERG YET 
MVNFGRQGbNLAA?AAVTAAVKSCGAI TER LRSFSMHDLTTI OG 
DEPVGORPYOPLPEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA 
EGPYSPNEMLTKKGrRRSOSMKSVKTTKGR KEVRYGSLKYKVKK 
RPOVYF 


5567 


1554 


233 


EFLrGSG VS PDLANEDGLTALHOCCIDDFREMV0OLLEAGANI NA 
CDS ECW TPLHAAATCGHLHLVELLI ASGA-N LUAVNTDGNMPYDL 
CDDE0TLDCLETAMAJDRG1TODS I EAARAVPEbRMLDD I RSR IX? 
AGADLHAPbDHGATLLHVAAANGFSEAAALL.liEHRASLSAKDQD 
GWE PLKAAAYWGOVPLVELLVAHGADLNAKS LMDETPLDVCGDE 
EVRAXL LE L KH KJH D ALLRAQS RORSLLR R R TS SAGSR GKWRR V 
SLTQRTDLY R KOHAQEAI VWQOPP PTSPE P PEDNDDRQTGAELR 
PPPPEEDNPEWRPHNGRVGGSPVRHLYSKRLPRSVSYQLSPLD 
S TT PHTLVHDKAHKTLADLKRQRAAAKLOR PP PEG PES PETAEP 
GLPGDTVrPQPDCGFRAGGDPPLLKbTAPAVEAPVERRPCCLLM 


5568


173i 


587 


AEDKQPASRRGAGTTAAMAASGPGCRSWCLCPEVPSATFFTAI.L 
5LuVSGPRl>FLLCX}PLAPSGLTLKSEALRNW0VYRI>VTYIFVYE 
KPI SLLCGA 1 1 IKRFAGNFERTVGTVRHCFFTVI FA I FSAIIFL 
S FEAVSSLS KLGEVEDARGFTPVAFAMLGVTTVRSRMRRALVFG 
M W PS VX.VPWLLLGAS WLIPQTS FLSNVCGLS I GLAYGLT YCYS 
IDLSERVALKLDQTFPFSLMRRISVFKYVSGSSAHRRAAOSRKL 
NPVPGSYPTO$CHPHI>SPSHPVSQT0HASG0KIiASKPSCTPGHM 
FTLPPYOPASGLCyVQNHFGPNPTSSSVYPASAGTSIXSlOPPTP 
VNSPGTVYSGALGTPGAAGSKESSRVPMP 


5569


2 


835 


OTPCPLAWERGSRSEDISVPGOKPPTCSEFSGKDVGPSSLPHLG 








LKLLLLLLLLPLRGOANTGCYG1J-GMFGLPGAPGKDGYDGLPGP 
KGEPGIPAIFG2RGPKG0KGEPOLPGHPGKNGPMGPPGMPGVPG 
PMGIPGEPGEEGRYKOKF0SVrTVTROTHOPP7iPNSLlRFMAVL 
TNPOGDYIXI'ETGKFTCKVPGLY^FVYHASHTANIjCVLLYRSGVK 
WTFCGHTSKTK0V?JSGGVLLRLQVGFrVWLAVNDYYDMVGIOG 
SDSVFSGFLLFPD 


5570


264 


94e 


rdrrdrggvatsteefarprapcsrcpgpvsqtgrgrergggdt 

MS£PSPGKKRHDTDWKLIESK>JEVT J IjGGLNEFWXFYGPQGT 
P YEGGVW KVR VDLPDK YP FKS PS j G FMN KI FHPN I DEASGTVCL> 
DVJNOTWTALYDLTNIFESFLFO^LAYPNPIDPLNGDAAAMYLH 
RPEBYKQK IKEY jQKYATEEALKFOEEG'JGDSSSESSMSDFSED 
EAQDMEL 


5571


264 


94t 


rdrrdrggvats~ekfaj*prapc>srgpgpvsqtgrgr£rgggdt 
msspspgkrrmdtd\a/kl:eskhf.vti!^glnefwkfygpogt 
pyeggwjkvrvdlpdkypfkspsigfmnkifhpnideasgtvcl 

DVIN0TWTALYDLTN1FESFLPC>L1^AYPNP1DPLNGDAAAMYI>H 
RPEBYKOKlKEYlOKYATEEAiKfcQEEGTGDSSSESSMSDFSED 
EAQDMEL. 


5572


280; 


206h 


RTDY R TG 1 PC R R FR VMAAGDGDVK LGTLGSGS ES SN DGGS ES PG 
DAGAAAEGGG WAAAALALLTGGGE ML1 .WALVA1WLLGAYRLWV 
RWGRRGbGAGAGAG E ES PATS LPh MK KR DFS LEOLR OYDG S RNP 
RIUiAVNGKVFDVTKGSKFYGPAGPYGIFAGRDASRGLATFCLD 
KDAI.RDEYDCLSDLNAVOMESVRFWEHOFKEKYDYVGRLLKPGE 
EPSEYTDEEDTKDHNKOD 


"SB7 3~ 


2bc: 


21<- 


V P AR T PN AEPOG P E ARAATAT PCC S GG R ERAGEAAEDG VKMAAF 
SEMGVMPSlAOAVEF.riDWLLPTDlOAFS I PLIbGGGDVI>tAAET 
GSGKTGAFS I PV1QI WETbKDOCEGKXGKTTI KTGASVLNKWQ 
MNPYDRGSAFA 1 GSDGL-CCQSREV KEWHGCRATKGLMKGKHY YE 
VSCKDQGLCR VGWS TMQASLDLGTDKFG FGFGGTGK KSHNKQ FD 
NYGEEFTMHDT IGC YLD1 DKGKVKFS KNGKDLGLAFE1 PPHMKN 
OALF PACV LKNAEbKFN FGEEEFK F P P JCDGFVALSKAPDG Y I VK 
S^HSGNAOVPOTXFLPNAPKALIVEPSRELAEOTLNNIKOFKKY 
IDNPKLRELL31GGVAARD0LSVI.ENGVDIWGTPGRLDDLVST 
G KLNLS OVRFLVLD EADGbbSQG YS DF I NRMHNQ I PQVTS DG KR 
LQV I VCS ATLH SFDV KKI ,S EKI Kn F PTWVDLKGEDS VPDTVHHV 
WPVWPKTDRLWERLGKSHIRTDDVHAKDNTRPGANSPEMWSEA 
IKILKGEYAVRA1KEHKMDOAI2FCRTKIDCDKLEQYFIOOGGG 
PDKKGHOFSCVCLHGDRKPHERKCNLER FKKGDVRFLI CTDVAA 
RGID1 HGVPY VI NVTLPDEKQNYVKR1 GRVGRAERMGbAI SLVA 
TEKEKWYHVCSSRGKGCYNTRLKEDGGCTlKWEMQLLSEIEE 
HLNCT3SQVEPD3KVPVDEFDGKVTYGQKRAAGGGSYKGHVDIL 
APTVQEbAAXE KEAQTSFLHLG YLFNObr RTF 


5574 


1731 


9Si 


NEGLEVFKE0ELQPHDKGAVPEDASTERSAMASLGLQLVGY1LG 
LLGbbGTLVAKl.LPSWKTSS YVG AS 2 VTAVGFS KGbKME CATH S 
TGI TOCD I YSTbIX5LPADIQAAQAttMVTSSAISSLACI IS WGM 
RCTVFCOESRAKDRVAVAGGVFFILGGLLGFIPVAWNLHG1LRD 
FYSPLVPDSMKPEIGEALYLGI1SSLFSLIAGIILCFSCSC0RN 
RSNYYDAYQAQPLATRSS PR PGOPPKVKSEFNSYSLTGYV 


5575


45G 


766 


LLWAJjPCPPPTAAAVljl^STGLMELLEKMI^TIAKADSPRTAL 
bCSAWLbTAS FSAQQH KGS LQKDPL LS QAC VGCLEALbDYbDAR 
S PDIGRNS PHYLMFP 


5576 


249 


2146 


RSWGAPW FWRMRLbRR RRMPLR LW< VGC AFVLFLFLLHRDVS S R 
EEATEKPWLKSLVSRKDHVLDLMLEAMK^LRDSMPKLOIRAPEA 
QQTLFSI NQSCLPG FYTPAELKPFKERF PQDPNAPGADGKAFQK 
SKWTPLETOEKEEGYKKHCFNAFASDR I SLQRSLGPDTRPPECV 
DOKFRRCPPLATTSVIIVFHNZAWSTLLRTVYSVLHTTPAILLX 
E 1 1 LVX)DASTEEHLKEKLEQYVKC?bQVVRVVRQEERKGLITARL 
LGASVAQA2 VbT FLDAHCECFHGWbEPbLARI AEDKTVWSPD I 
VTIDLNTFEFAKPVQRGRVHSRGNr DWS LTFGWETLPPHEKQRR 
KDETYP I KS PTFAGGLFS 3 S KSY FEK 1 G TYDNQME I WGGENVEM 








SFRVWQCGGQLF : 1 FCSWGHV? RTXSPHTFPKGTSVI ARNQVS 
LAEVKMDSYKKI Y V RRNLOAAXMAQEKSFGDI SERLQLREQLHC 
HNFSWY LHNVYP 1 .'.FVPDLTPTFYGAJ KNLGTN0CLDVGEN7JRG 
GKPL1MYSCHGLC-CNOYFEYTTORDI»KHNIAKOI*CLHVSKGALG 

lgschftg knsc '• h xdeewebaqdql jrjssgsgtcltsqdkkpa 
mapcnpsdphqlv;lfv 


5577 


3 


127S 


RNSDCSCGEISVyri.PWVLFl LDLKVESSMFCFLKLI LLPVLLD 
YSLGLNDLNVSF ? r.LTVHVGPSALMGCVFQSTEDKCI FKIDWTL 
SPGEHAXDEYVLVVYSNLSVPTGRFQNRVULMGDILCNDGSLLL 
QDVQZADQGTYjCLjRhKCESQVFKKAVVLHVhPELPKShKVHV 
GGLIOMGCVFOS': LVKJIVTKVEWI FSGRRAKEE1VFRYYH.KLRM 
SVEYSQSWGHFCKfVNLVGDI FRNDGS IMLQGVRESDGGNYTCS 
IHLGNLVFKKT3 V^HVSPEEPRTLVTPAAXRPLVLGGNQLVl IV 
GIVCATILLLPV:. : LIVKKTCGNKSSVNSTVLVKNTXKTNPEIK 
EKPCHFERCEGEKK J YSPI I VREVI EEEEPSEKSEATYMTMKPV 
WPSLRSDRKNSLE r.KSGGGMPKTQQAF 




3 


782 


AVESMASPGACRA! PELPERNCGYREVEYWDQRYQGAADSAPYD 
VJFGDF£SFRAL»LE; EIjRPEDRILVLGCGNSALSYELFLGGFPNV 
TSVDYSSVVVAA^X'ARYAHVPOLRWETMDVRKUDFPSASFDVVL 
EXGTLDALUvGFRi'PWTVSSEGVHTVEQVLSEVSRVLVPGGRFI 
SMTSAAPHFRTRK-AQAYYGWSLRHATYGSGFHFHLYUfHKGGK 
LSVAQI.ALGAQI hi- PPRPPTSPCFLQDSDHEDFLSA1QL 


3 


1540 


RNSGLARGASAL/-.^HGGGLAGGVGWDCGACASRCOGVHEGLLTR 
CRALPAJ^ATCSRC 1 SGYVPCRFHHCAPRRGRRLLLSRVFQPCNL 
REDRVLiSLODKSCr. LTCKSQR^MLOVGLI YPASPGCYKLLPYTV 
RAMEK LVR V 1 upEl* QA I GGOKVNMPS LS P AE 1>VJQ ATN R VI DLMGK 
ELLR LRDRHG KE V ( bG PTHEEA I TALI ASQKXLS YKOL P FLLYQ 
VTRKFRXEPRPR^LLKGRhrFYMKJDMYTFDSSPEAAOOTYSLVC 
DAYC t >L»FN T KI»G:»r •■ VKVOADVGTj GGTVSHEFQLPVD J GEDRLA 
1 C PRCS FS ANME: ■ CLSOMNCPACOG PLTKTKG I EVGKT F YLGT 
K Y S S I FNAQFTNV : X? KPTLAEKGCYGLG VTR Z LAAA1 E VLSTED 
CVRWPS LLAPYQA:. bJ PPXKGSKEQAASEL 1GQLYDH1 TEAVPQ 

lhcevllddrthl:- ignrukdanrfgypfviiagkraledpahf 
evwcqn tgbvafltkdgvmdij.tpvotv 


5580 


1681 


45C 


ADAGTRCIPGFVVi hGAGYSAPAQRGRRSSGRMRAAAAPGLTAP 
WRLLQCCELEAG E L7-KAV P AAAMGPS ALGQSGPGSMAPW CSVSS 
G PSR Y V LCMQEL F ?. GH SKTREFLAHS AKVHS VAWS CCCR R LASG 

sfdktasvfllek:^lvkennyrghgpsvdqlcwhpsnpdlfvt 

ASGDKT I R I WDV KT i'KC 1 ATVHTKGEN IN 1 CWS PDGQT 3 AVGNK 
DDWTF I DAKTHR5 KAEEQFKFEVNE I SWNNDNNMFFLTttGNGC 
INI LSY PELKPVC-nNAHPSNClCIKFDPKGKYFATGSADALVS 
L WDVDE LVCVRC F5 RLDWPVRTLSFS HDGXMLASASEDHFI D3 A 
EVETGDKLWEVOC:-- S PTFTVAWHPKRPLLAFACDDKDGXYDS SR 
EAGTVXLFGLPND5 


5581


54 


94 7 


GGGS G P RAP SAT L i_ TG E S VAA VASG E DKG I AAS AAAAA V F A CS 
CSPDPOSSTMNPVV S PVQ PGAP YGNP KNMAYTGYPTAY PAAAPA 
YNPSLY?TNSPSY;.?EF0F1/HSAYATLLMK0AWP0NS£SCGTEG 
TFHLPVDTGTENP. 7 VOASS AAFRYTAGTP Y KVPPTQSKTAPP PY 
SPSPNP YQTAMYF " R S A Y PQQNL YAQG A Y YTQPVYAAQPHV 1 HH 
TTW0PNSIPSA1 Y PAPVAAPRTj^VAWMVAGTTMAMSAGTLL 
TTPOHTA I GAHP V S M PT YTUVQGTPAY S Y V PPH W 


5582 


5775 


273S 


1 ITlTONimiPLV j »,YHLSGSA0ARGERSPA£RLMER0KRKAD1 
EKGLQF I QSTLPLKQEEYEAFLLKLVQNLFAEGNDLFREKDYKQ 
ALVOYMEGLJWAJDyAASDOVALPRELLCKI-iHVN 
EKALEDSEKALGLEK ESI RAiFRKARALNELGRHKE AYECSSRC 
SIALPKDESVTOLGOELAOKLGLRVRKAYKRPQELETFSLLSNG 

taagvadqgtsng: gsiddi etdcyvpprgspallpstptmpl? 
phvldllapldssftlpstdslddfsdgdvfgpeldtlldslsl 
v0gglsgsgvpse1 ?qhl pvfpggtpllppwggsipvssplfp 
asfglvmdpskxl>--svldald?pgptldpldlbpysetrldal 








ds r gs7rgs ldk fds fmf.et nsqdhr f psgaokpaps pe pcmpn j 
tall1 knpixaathefkoacobcypktgpragdyiyreglehkck 1 
rdili^rlrssfjdqtwkrirprptktsfvgsyylckdminkcdc ■ 
kygdnctfayhcef i dvl^tfxrkgtlnrdllfdplxsg vxrgslt j 

1AKLLKEHQG1F TFLCE1 CFDSKPRIISKGTXDSPSVCSl'JX-iAAX 
KSFYtWKCl,V}n\ r RSTSLKYSKIROFQEHFOi r DVCRHEVRYGCL 
REDS CK FAHSF 1 ELKVWLLQ0YSGMTHED1VQESKKYWQ0MFAH 
AGKASSSMGAFaTHGPSTFD2,0MKFVCGQCWRNGCWEPDKDLK 
YCSAKARHCWTKERRYLLVWSKAKRKWVSVRPLPS3RNFPQCYD 
LCI JiACNGR KCO YVGNCS FAHSPEERDMWTFMKENK 3 LDMQQTY j 
DMWLKKHNPGKPGEGTPI PSREGEXQICMPTDYADJ MMGYHCKL I 
OGKNSMSKKQWOQHI QS EK.-IKEKVFT3DSDASGWAFRFPMGEFR 
LCDRLOKGKACPDGUKCRCAHGOEELNEWLDRREVLKOKIAKAR 
KDMLbCPRDPDFGKYNFLLQEDGDliAGATPEAPAAAATATTGE 1 


5583 


3 


126h 


ssgcrcgrpgrsdkprpp: j hrkkmvkbtryydilgvkpsaspee . 

IKKAYSKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 1 
QGGEQA3KECGSG5PSFSSPMDIFDMFFGGGGRMARERRGKNW 
HQLSVTIjEDLYNGVTKKLAL^KNVI cskcegvggkkgsvekcpl 
CXGRGMHI H 3 00 1 GFGMVQQ1QTVCI ECKGQGERINPKDRCESC 
SGAKV1REKKI1 tVl!VEKGKKDGOKILFHGEGDOEPELEPGDVI 
IVLDQKDHSVFC R RGKDLI M KMK1QLSEALCGFKKT 1 KTbDNR I ! 
LV3TSXAGEV1 KHGULRCVRDEGMPI YKAPLEKG3L3 1QFLVIF 
PEKHWLSLEKLPOl,EALI,P?R0KVRITDDMDQVELKEFCPNEC>N 
WROHREAYEEDEDGPQAGVOCOTA 


5584


3 


126S 


SSGCROGRPGRSDRPRPPPRRHKMVKETRYYDZLGVKPSASPEE 1 
IKKAYRKLALKYHPDKNPDEGEKFKLISOAYEVLSDPKKRDVYD 
QGGEQAI KEGGSGSPSFSSPMD1 FDMFFGX3GGRMARERRGKNW 
H0LSVrLEDLYNGVrKKLAL0KNV3C£KCEGVGGXKG£VEKCPL 
CKGRGKHIH3 00 1 GPGMV00 1 QTVC 1 ECKGQGER 1 NPKDRCESC 
SGAXVI REKKJ I E VHVEKGMKDGQK I LFHGEGDQEPELEPGDVI 
1VLD0KDHSVF0RRGHDLIKKMKI0LSEALCGFKKTIKTLDNRI 
LVI TSKAGEV1 KHGDLRCVRpEGMP I YKAPLEKG I LI IQFLVI F 
PEKHWLSLEKLFOLEALLPPR0KVR3TDDMD0VELKEFCPNE0N 
WRQHREAYEEDEDGPQAGVQCQTA 


5585


2619 


915 


LPAGTPESSLHEALDOCMTALDLFLTN0FSEALSYLKPRTKESM 
Y^SLTYATILEMCAMMTFDPODILLAGNMMKEAQMLCORHRRKS 
SVTDSF SSLVKR F TLGQFTEEEIHAEVCYAKCLLORAALTFLQD 
ENMVSFTKGG1KVRMSY0TYKEL.DSLVOSSOYCKG2NHPHFEGG 
WLGVGAFNLTLSMLFTRILRLLEFVGFSGNKDYGLLOLEEGAS 
GHS FRS VLCVMLLLC YHTFLTFVLGTGNVN I EE AE KLLKP Y LNR 
YPKGAI FLFLAG R I E VI KGNI DAA3 RR FEECCEAQQH WKQFHHM 
CYWELM WCFTY KGOW KMS Y F Y ADLLS KENCWSKATYI YMKAAYL 
SMFGKEDHKPFC-DDEVELFRAVPGLKLKIAGKSLPTEKFA1RKS 
RRYFSSNPISLPVPALEKMYIWNGYAV1GK0PKLTDGILEI3TX 
AEEtfLE KGPENE V s VDDECL VKLLKGLCLKYLGR VQE AEEN FR S 
ISANEK K I K YEH Y M PNALLELALLLMEQDRNEEA I KLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSKVSSVSL 


5586 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESK 
YHSLTYATI LEMQAMMTFDPQDI LLAGNMMKEAOMLCORHRRKS 
SVTDS FS SLVNR PTLGQFTEEE I KAEVCYAKCLLQRAALTFLQD 
EKMVS F 3 KGG I K V R N S YQTY KE LD S LVQS SQY CKG ENH PH FEGG 
VKT/^vr ZiVNT TL.CMLPTRTT T.FFVGFSGNKDYGLLOLEEGAS 
GHS FRS VLCVML3,LC Y KTFLTFVLGTGNVN I EEAEKLLKPYLNR 
YPKGAI FLFXaAGR 1 EVIKGN 3 DAAI RRFEECCEAQQHWKQFKHM 
CYWELMWCFTYKGOWKMS YF YADLLSKENCWSKATY I YMKAAYL 
SMFGKEDHKPF3DDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRY FSSN P 3 SLPV PALEMWY3 WNGYAV3GKQPKLTDG 3 LEI I TK 
AEEMLEKGPETJEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
3SANEKK 3KYDHY LI PNALLEIALLLMBQDRNEEAIKLLESAKO 
NYKNYS MESRTHF R 3 OAATLOAKS S LENS SR SMVS SVS L 


5587 


1768 


148 


SSAVPDGAVGR P VAVA.VGGPF HS CRCRPCCLMAA I GVKLGCTSA 








C V A V Y X OG RAG VV ANDAG DRV T P AVV AY S EX E £ 1 VGLAA KQS R 1 
RMISN7VMKVKQI LGR5-" S SDPOAQKYIAESKCLVIEKNGKLRYE 

1 Ui\5Lt i M VJ\ rCU V t\t\ LJ r o rJ live 1 /Uto VLboUnliUv V 1 J vrf 

DFGEKO>0>JA1GEAA^AAGF2JVLRLIHE?SAAI^AYGIGODSP7'G 
* SN1 LV F KLGGT SLS LS VMEVN SGI YRVLSTNTDDNIGG AHFTE 
TtAQYLAEEFQRSFKHDVRGNARAMMKLTKSAEVAKHSbSTLGS 
ANCFLDS LYEGODFDCM^'SRARFELLCSPLFNKCI EAI RGLLD^ 
urcTh-in TMionn rrrct;o T dvt VHT PPIVFI f mct Dnnp 

Nuf 1 Ai-'U J Nft W LLubi>-:Kl K l\ L^Uui Awbr r>* vt JULrlNo -i rrUt 

VI PI GAA J EAG I LI GKF.KLLVEDSLMIECSARD I LVKGVDESGA 

srftvlfpsgtplparrohtloapgsissvclelyesdgknsak 
eetkfaow1odldkkenglrdilavltmkrdgslhvtctd0et 
gkcea:sjeias 


5588






7 PPP PEOAMV AATV AAAWLLLW AAACAOQF.QDf Y D?KAVN 1RGK 
L VSLE K Y RGS VS LWNV AS ECG FTDQH YRAI jOQLQ RDLG PHH FN 

vlafpcnoegooepdsnkeiesfarrtysvsfpmfskiavtgtg 

AHPAFKYLA07SGKEPTWNFWKYLVAPDGKWGAWDPTVSVEEV 
RPQ1TALVRKL3 LbKREEL 


5589 


1884 


552 


LR QAWH EGG 1 GQTDKERGAAAL PGE EGDPTRGR S LGRASW ESGS 
PRRPRSPFSSFLPRPICLSLEARPCSIEDRRJWSHGRPGAPAS 
GXjNRSSGI^LGPDRCRPRSRCSCRVMENPSPAAALGKALCALLL 

atbgaagop lgge sics arap ak y s 1 tftgkvisqtafp kqy p lf 
rppaqwssll^aahssdysmwrknqyvsnglj^dfaergeawalm 
keieaageai^qsvhavfsapavpsgtgqtsaelevorrkslvs? 
vvriw^pdwfvgvdsldlcdgdrwreqaaldlypydagtdsg? 
tfsspnf atipodtvteitssspshpansfyyprlakalpplarv 

TLLRLROSPRAFlPPAPVlvPSRDNEIVDSASVPETPLDCEVSLW 
SSWGLCGGHCGKLGTKSRTRYVRVOPANNGSPCPELEEEAECVP 
DNCV 


5590


12 


896 


LCSSGAL,RLl»PAMVAWRSAFLVCLAFSLATL\/QRGSGDFDDFNL 
EDAVKEa-SSVKOPWDHTTTTTTlJRPGTTRAPAKPPGSGLDLADA 
LDDODDG RR K PG 1 GGR ER WNHVTTTTKR PVTTRAPANTXGKDFD 
LADALDDRNDREDGRRKPIAGGGGFSDKDLEDIVGGGEYKPDKG 
KGDGRYGSNDDFGSGMVAEPGTIAGVASALAMALIGAVSSYISY 
00KKFCF5 1 0CCLNADYVKGENLEAWCEEPQVKYSTLHTOSAE 
PPPPPEPARI 


5591 


68 


1494 



AGSSRKAAAERrLVSAGCRSLAGRASGVXLLPAELLPGEEEAMA 
LRVTRNS K I NAEKKAKI NMAGAKRVPTAPAATSKPGLR PRTALG 
D J GNKVS EOLQAXMPMKKEAKPS ATGKV1DKKLPKPLEKVPMLV 
PVPVSEFVPEPEPEPErEPVKEEKLSPEPlLVDTASPSPMETSG 
CA PAEEDbCQAFSDVI IA VNDVDAEDGADPNLCSE YVKDI YAYI* 
RQLEEE0AVRPKYLLGREVTGNMRAIL1DWLVQV0MKFRLLQET 
MYMTVSj 1 DR FMCNNCV P KKMLQLVG VTAMF I AS KY EEMY P P E I 
rnFaPVTnMTYTVVriTRDMFMKTl.RAI J4FG1/5RPLPLHFLRRAS 
KlGEVDVEQHTLAKYLMELTMLDYDi4VHFPPSQIAAGAFCLALK 
ll^DNGEWTPTLOHYLSYTEESLLPVMQHLAKNAAMVNOGLTKHM 
TVKNKYATS KHAKI STLPOLNSALVQDLAKAVAKV 


5592 


242 


924 


YGES KDWNOKDbLSALVLTTVNCLPTPI MAXSAEVK1AI FGRAG 
VGKSALWRFLrXRFIWEYDPTLESTYRHQATlDDEVVSMEILD 
TAGQEDT 3 QREGHMRWGEGFVLVYDITDRGSFEETVLPLKNILDE 
IKKPKNVTLIL\H3WKAI)LDHSRQVSrEEGEKIjATEIJlCAFYECS 
ACTGEGK 1 ?EI FYELCREVRRRRMVQGKTRRRSSTTHVKOAIKK 
MLTKISS 


5593 


3 


1122 


HASGGRAANMAAERGAGQOOSQEMMEVDRRVESEESGDEEGKKH 
SSGIVAD LS EC-SLKDGEE RGEEDPEEEKELPVDMET INLDRDAE 
DVDLNHYRIGKI EGFEVbKKVKTLCLRQNLI KCl ENLEELQSLR 
BLDLYDNOI KK1 ENLEAbTELE I LD I S FNLLR N I EG VDXLTR LK 
KLFLVNK KISKI ENLSNbHQLQMLEbGSNR I RAIENI DTLTNLB 
S LFLGKNK ITKLONLDALTNLTVLSMOSNRLTKI EGLQNLVNLR 
ELYbSKNG IEVIEGLENI^KLTMLDIASNRI KKI ENISHbTEbQ 
EFVfl4NDNLLESWSDLDEbKGARSLETVYbERNPb0KDP0YRRKV 
MLALPSVROI DATFVRF 


5594 


3 


111? 



HASGGRAJ^SlAAEKGAGQOOSOEMMEVDKRVESEESGDl-EGKKH 
SSCIVADLS^QSLKDGEERGELDPEEEHELPVDMETlNIiDRDAE 
DVDLNHY R ] G K I EGFEVLKKVKTLCLRC'NLl KC I ENLEELQSLR 
BH)L*YDNQ3 KKI ENLEALTELE3 LDI SFNLLRNI EGVDKLTRLK 
KLFbVNNK 1 S K I SNLSNLHQLQMLELGSNR IRA I EN I DTLTNLE 
SLFLGKNK 1 T K1,0NLDALTNLTVLSMQSNRLTK I EGIjQNLVNLR 
ELYLSHNGIKVIEGLEKNNKLTMLDIASNRIKKIHNISHLTELQ 
E FWNNDNL.LE S WS DLDE LKG AR SLETVYLERN PLO KD PQY RR KV 
MLALPSVRQ) DATFVRF 


5595


3 


1476


ARWNGRKVQVPAWPGPGCGTNASGERQRQLPRAWRPVGRTLGSE 
PIALAWSPPLYLFPIPLPSWAVS0PTPTLGTMFADLDYD1EEDK 
LG 1 PTVPGKV TLQKDAQNL I G I S I GGGACj YCPCLY I VQ VFDNTP 
AALDGTVAAGDEI1X5VNGRS3KGKTKVEVAKMIOEVKGEVTIHY 
NKL0ADPK0GMSLUIVLKKVKHRLVENMSSGTADALGLSRA1LC 
KTJGLVKRL.EELERTAELYKGVTTEHTKNLLRAFYELSOTKRAFGD 
VFSV 3 GVREPQPAASEAFV.KFADAHRS IEKFG I RLLKT3 K PMLT 
DLNTYLNKA3 PDTRLTI KKYLDVKFEYTiSYCLKVKEMDDEEYSC 
IALGEPl»yRVSTGNYEYRliI LRCRQEARARFSOMRKDVLEKMSL 
LBQKHVOD I V FOLCRLVSTMS K Y YNDC YAVLR DADV FP 3 E V DLA 
HTTLAYGlxNUEFFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596




21S> 


GAVLA PSS LPAAELAAQGES QSLEDLSNTSRPTSE VYK 3 S FI FP 
NGDK YDCDCTRTS SG I YERNGI G IHTTPNG3 VYTGS KKDDKf'iNG 
FGRLEHFSGAVYEGQFKDNM F HGLGTYTFPNGAKYTGN FN ENRV 
KGEGEYTHICGTKMDVVTFHFTSCSQT 


5597 


3 




I SCKMA7U3GCE SLPASWRSVTLTHVEYPAGDLSGHLIAYLSl^SP 
VFVI VGFVTL'J 3 FKRELHTI SFLGGLALNEGVNWLI KNV1QEPR 
PCGGPHTAVGTKYGMPSSHSOFMWFFSVYSFLFLYLRMHOTNNA 
R FLDL LW R } IV US LGL LA VAFLV SYS R VY LLYHTWS Q VLYGG 3 AG 
GLMA3AWPIFT0EVLTPLFPRIAAWPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARRRC-RKLGTKLQ 


5598 


326


2440 


G1GP3AAS Fl POH/ASLYIFLSPPPPSVSGVPYSPANSSWSCAb 
VPLLGSGVPPi;PPAPSPCC5G0TMLKMbSFKLLLLAVALGFFF.G 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSOLELLSGG 
EMLCGGFY PRLSCCLRSDSPGLGRLENKI FSVTNNTECGKLLEE 
IKCALCSPHSOSLFHSPEREVLERDLVLPLLCKDYCKBFFYTCR 
G H I PG FLQTT AD£ F CFY Y AR KDGGLC F PDF PR KQ VR G P AS NYLD 
QMEEVDKVEEISRKHKHNCFCIQEVVSGLROPVGALHSGDGSQR 
LF3LEKEG YVK 3 LTPEGEI FKEPYLD3HKLVOSGI KGGDERGLL 
SLAFH PN Y K KNG K LYVSYTTNQERWAl GPHDH I IRWE YTVSR K 
NPHQVDLR TAR VFLEVAELHRKHLGGQLLFGPDGFLY 1 1 LGDGM 
I TLDDME E M DG 1>S DFTGS VLR LD VDTDMCNVPYS 3 PRSNPWFNS 
TNQPPEVFAHGLKDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQI 1 KGKDYESEPSLLEFXPFSNGPLVGGFVYRGCQSERb 
YGSYVFGDR-VGTATbTLOQSPVTKOWOEKPLCbGTSGSCRGYFSG 
HlbGFGEDELGEVYIbSSSKSMTQTHNGKbYKIVDPKRPLMPEE 
CRAWOPAOTbTSECSRbCRNGYCTPTGKCCCSPGWEGDFCRTG 


5599


326 



2440 


GIGPlAASrTFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAb 
VPbbGSG V P PU F PAPS PCCSGCTMLKMLS FXLLLLAVALGFFEG 
DAJ<FGEPJ*EGSGARRRRCLNCNPPKRLKRRDRPJWSQLELLSGG 
EMLCGGFY PRLSCCLRSDSPGLGRLENKI FSVTNNTECGKLLEE 
1KCALCSPHSOSLFHSPEREVLERDLVLPLLCKDYCXEFFYTCR 
GH I PGFLOTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFC3QEVVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSG2KGGDSRGLL 
SLAFHPNYKKNGKLYVSYTTNOERWAIGPHDHILRWEYTVSRK 
N PHOVTJtRTAR VFLEVAELHRKHLGGQLLFGPDGFLY 1 1 LGDGH 
I TLDDMEEMDGLSDFTGS VLR LD VDTDMCWP YS I P R SNPHFNS 
TNQPPEVFAKGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
i?AR 1 _,n_L J KL»KDyt:Sfc.PSijLEFKPr SNGrLVGGFVYRGCQSERL* 
YGSyVFGDRNGNFLTLOOSFVTKQWOEKPLCLGTSGSCRGYFSG 
Kll.G FGELEL.GEVYILSSSKSMTQTHN 1 GKLYKIVDPKRPLMPEE 
GR7iTV0PAOTLTSECSRLCRNGYCT?TGKCCCSPGWEGDFCRTG 


5600 


1977 


1244 


SLRVLSGHLMOTRDLVOPDKPASPKFlVTLDGVPSPPGYMSnQE 
EDMCFSGKKPVNQTAASNKGI.RGLLHPOOLHLLSROLEDPNGSF 
SNAEM.SEL^VAOKPEKLLERCKYWPACKTCGDECAYUHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTH\ 7 SRRIPVLSPKP 
AVA PPAPFSSS QI,CR Y FF ACKKMECPF YHPKH CR FNTQCTR PDC 
TFYHPTI NVPPRHALKWJRPQTSE 


5601 


1977


1244


SLKVLSGKLMQ7RDLV0PDKPASPXFI VTLDGVPSPPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPOQLKLLSROLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKMGDECAYHHPISPCKA 
FFNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVliS PXP 
AVAPPAP PSS SQI>CRY FPACKKKECPF YHFKHCRFNTQCTRPDC 
TFYHPTI NVPPRHALKW1RPQTSE 


5602


246 


766 


YHTSCTVWRTAKEALENTEVPVGCLMVYNNEWGKGRNEVNQTK 
NATRKAEMVA IDQVLDWCRQSGKS PSE V FEHTVLY VTVE PC 1 MC 
AAAL R LM KIP L WYGCQN E R FGGCGS V I AS ADLPNTGR P FQC 
I PGYRAEEAVEMLXTFYKOENPNAPKSK-^RKKECQOI LNMF 


5603 


1 


565


FRGRTPI SG^FRGCAOYPl PATPARSGENRTMPGAGDGGKAPAR 
WLGTGLLGLF1>LPVTLSI,EVSVGKATP1 YAVNGTEILLPCTFSS 
CFGFEDLHFRWTYTJSSDAFKI LIEGTVKNEKSDPKVTLKDDDRI 
TLVG STKE KRNN 1 S I VLRDLE FSDTGK Y TCHVKN P XENNLQHHA 
TlKLO/WDRRMQ 


5604 


3 


1506


EDIPPAOLLKUJRHERVWOQEPPVRDHRSWGGSGAGGVAGREWT 
D0GQVA1X5GH YMAEGEGYFAMSEDELACS PYI PLGGDFGGGDFG 
GGDFGGGDFGGGDFGGGGSFGGHCLDyCESPTAHCNVLNWEQVQ 
RLDG1LSETI PIHGRGNFPTLELQPSLH VXWRRRLAEKRIGVR 
DVRI JYGS AAP HVLHQDSGLGYKDLDL1 FCADLRGEGEFQTVKDV 
VLDCI,l.DFLPEGVNKEKITPLTbKEAYVCKMVKVCNDSDRWSLI 
SL3NN5GKNVELKFVDSLRROFEFSVDSFOIKIJDSLLbFYECSE 
NPMTETFHFTI IGESVYGDFQEAFDHLCNXI I ATRNPEEJRGGG 
LLKYCI^LV^GFRPASDEIKTJLORYMCKRFFIDFSDIGEQORKL 
ESYLQNH FVG I ,EDRKYEYLMTLHGWNZSTVCLMGHERRQTLNL 
1 TMLJUR VLADQNV1 PNVANVTCYYQPAPYVABANFSNYYIAQV 
O P VFTCQCQT YSTWLP CN 


5605 




1621 


SORSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPAIjGL 
MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALKSLRRypLPLRSGKEAKILQHFGEGbCRMLDERLORHRTSG 
GDHAPDSPSGENSPAPQGRLAEVODSSMPVPAOPKAGGSGSYWP 
ARMS GAR VjLLVLYREHLNPNGHHFLTKEELLORCAOKSPRVAP 
GSAR PWPALRSLLHRNLVl.RTHOPARYSLTPEGLELAQKIiAESE 
GLSLLNVG 3G PKEPPGEETAVPGAASAELASEAGVQOOPIjEbRP 
GEYR VLLCVD I GETRGGGHRPELLRELC'RLHVTHTVRKIjHVGDF 
VWVACETNPR DPANPGELVLDHI VERKRLDDLCSS 1 1 DGRFREQ 
K FRL XROGLERRVYLVEEHGS VHNLSLFESTLLOAVTNTQVI DG 
FFVKRTAD1 X ESAAYLALLTRGLQRLYCGHTLRSRPWGTPGNPE 
SGW/JT5PNPbCSLLTFSDFNAGAIKNKAOSVREVFARQLM0VRG 
VSGE XAAAXVDRYSTPASLLAAYDACATF XEQETLLST2 KCGRL 

riTJWl CPTTJ^HT VPQVnPT T 
KJ Af> 1_» v» rHLi O n. 1 JjO ^i-> JV.OI Urlil 


5606




1099 


GR SR C?G PGARGGTMS PRS CLRS LRLLVr AVFS AAASNWLYLAK 
L5SVGS 1 SEE ETCEKLKGLIORQVQMCXKKLEVMDS VRRGAQIA 
lEECOVOFRN'RRWNCSTLDSLPVFGKWTOGTREAAFVYAISSA 
GVAFAVTRACSSGELEKCGCDRTVHGVSPGGFOWSGCSDNIAYG 
VAFSCSFVTJVRERSXGASSSRALMNLHMNEAGRKAILTHMRVEC 
KCHGVSGSCEVKTCWRAVPPFROVGHALKEKFDGATEVEPRRVG 
SSPALVPRNAQFKPHTDEDLVYLEPSPDFCEODMRSGVIX5TRGR 
7 CNKTS XA I DGCETLLrCCGRGFHTAQVE^iERCS CXFH WCC FVXC 
RCGQR1.VELHTCR 


5607


52: 


141


PPVCNPAE/iJ4PSPGTVCSLl>l,b(;tvil.WLDLAf4AGSSFLSPEH0RV
1 00 J< K ES K K P PAKI ,QPRALAGWLR P RDGC-OAEGAEDELE VRFNAP 
' FDVGIKLSGVOYQQHS0ALGKFLQDI LWEEAKEAPADK 


5608




983


WK/SPLROADPGPPRHTLFMDFVAGAIGGVCGDAVGYPLDTVKV 
R 3 C T EPK Y TG I WH CVF DT Y HR ER VWG FY R G hhh P VCT VS LVS S E 
VF^TYRHCLAlUCRLRFGNPDAKPTKAr. 1 TLSGCASGLVRVFLT 
SPTEVAKVRLOTQT0AQKO0RRLSASGP]aAVPPMCPVPPACPEP 
KY^GPLHCLATVAREEGLCGLYKGSSAIVLPDGHSFATYFLSYA 
VLCE WLSPAGHSRPDVPGVLVAGGCAGVl^WAVATPMDVIKSRL 
0AJ)G0GORRYRGl>LHCMVTIVREF.G?RVl,FKGLVLNCCRAFPVN 
MWFVAYEAVLRLARGLLT 


5609


1626 


304 


AK0VWV1,PSPPPRPGRGALVSGSGLRRGKSG7SWRPKRMNHKSK 
KR : R EAKR S AR PELKDS LDWTRHN Y Y ES FS LS PAA VADMVERAD 
ALOLSVEF.FVERYERPYKPWLLNAOb-GWSAOEKWTLERLXRKY 
KN0K?KCGKDNDGYSVKMKMKYY3FYMF£TRDDSPLYIFDSSYG 
EKPKRRKTbEDYKVPKFFTDDLFOYAGEKRRPPYRWFVMGPPRS 

gtg: kidplgtsawnalvoghxrwcl^ptstpreli KVTRDEGG 

N00DEAITWFNVIYFRT0LPTWPPEFKPLE1L0KPGETVFVPGG 
WWilV^/LNLOTTlAlTQNFASSTNFPV^HKTVRGRPWbSRKWYR 
ILKQEHPELAVliADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 
SECESGSEGDGTWRRKKRRTCSMVGNGDTTSUDDCVSKERSSS 
R 


5610


54 


119ft 


LEP'i'PASADMAWTKYOLFLAGIiMbVJG.S J K TLS A KWADN FMAEG 
CGGSKEHSFOHPFLOAVGMFLGEFKCK^AFYLLRCRAAGOSDSS 
VDPQOPFNPLLFLPPALCDMTGTSLMYVALNMTSASSFOMLRGA 
VI 3 rTGLFSVAFLGRRl>VJL.SOWLGI LATI ACLVWGLADLLSKH 
DSOH KLS EV I TGDLLI 3 MAQI I VA 1 0M V LE E KFVY KHNVH PLRA 
VGTL'GLr GFVI LSL»LLVPMYYIPAGSFSGNFRGTLEDAbDAFCQ 
VG0OPLI AVALLGNI SS 1 AFFNFAG 3 SVTKELSATTRKVLDSLR 
TVV J WAl>SLALGWEAFHALQILGFL3 LLI GTALYNGLHRPLLGR 
LSRGRPIjAEESEQERLLGGTRTPlNDAf 


5611


7 


571 


FV L P MR L»G I PG ST FRG PG AC AS S Sf. LA A < *> A K PG AGG S P ALAMSG 
EL£NRFOGGKAFGDLKAR0ERRbAE3KREFLCDOKYSDEENLPE 
KLTA FKEKYME FDLNNEGEI DLMSLKRMKEKLGVPKTHLEMKKM 
1SEVTGGVSDTISYRDFVNMMLGKKSAV1.KLVMMFEGKANESSP 
KPVGPPPERDIASIiP 


5612 


1 


721 


askdgymdatjaphrippempqygefnki felkqamwlckhlns 

S LL1 LEN L 1 LN E FS YTATEARRLYLOR K T V PS ALLV QL I QERLA 
EEDC3KQGW3LDCIPETREQALRIQTLG3TPRHVIVLSAPDTVL 
3 EPJv LGKRIDPQTGEI YHTTFDWPFES El ONRLMVPEDISEliET 
AQK LI >EY HRNI VRV I PS Y PKI LKV 3 5 ADO PCVDVFYQALTYVQS 
NHRTNAPFTPRVLJbbGPVGS 


5613 


115 


1279 


RGVLPALRRAEKML.PLSIKDDEYKPPKFNLFGKISGWFRSILSD 
KTSRKLFFFLCLNLSFAFVELLYGIWSNCLGL3SDSFHMFFDST 
AI LAGLAASV1 SKWRDNDAFSYGYVRAEVLAGFVNGLFLIPTAF 
FI FS EGVE RALAP PD VHH ER LLL VS I LG FWNL 1 G I FV FKKGGH 
GHSHC-SGHGHSHSLFNGALDOAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGHFHSHDX3PSLKETTGPSRQIL0GVFLH1LADTLGSIGVI 
ASAlKM0NFGLWlADPICSILIAlLIV\'SVIPI>LRESVGII4iQR 
TPP L LBNS LPOCYQR VQQIjOG V YS LOEQH FWTLCSDVYVGTLKL 
I VAPDADARWI LSQTHNI FTQAG VRQL YVQ I DFAAM 


5614 


3 


1268 


LLSRJ»JEHACPLCAG1X5LTCRKPKAI RG REGPJVTNQGQGETQNER 
APWGARORLGVmELQOLQEFEIPTGREALRGNHSALLRVADYC 
EDNYVOATDKR KALEE TMAFTTQALASVA Y QVGNLAGHTLRMLD 
LOGAALRQVE ARVS T1/%MVNMHMEKVPJ< R E I GTLATVQRLPPG 
0KV3 APENLPPLTPYCRRPLNFGCLDDIGHGIKDLSTOLSRTGT 
LSR KS 1 KAPATPASATLGRPPK I PEPVHLP WPDGRLSAASSAS 
SLAS AGSAEG VGGAPTPKGOAAP PAPPLPS S LDPPPP PAAVE VF 
ORPPTLEELSPPPPDEELPLPLDLPPPFPLDGDELGLPPPPPCF 
GPDEPSHVPAS YLEKWTLYPYTSOKDNELS FSEGTV I CVTRRY 
SDGWCEGVSSEC-TGFFPGNYVEPSC 


5615


9 


1558


AbGRRRPGDPREMr.AAATPAAAGAAKREELDMDVMRPLI NEONF 

LLG1.P1A3KNAG1VLGPISLVF1G11SVHCMH1LVKCSHFLCLR 
FKKSTLGYSDTVSF/^L'VSPWSCLOKOAAWGRSWDFFI.VITOL 
GFCSVYlVFLAENVKOVliEGFLESKVFlSNSTNSSNPCERRSVD 
X.RI YMLCFLPF1 1 LLVFlRELKKLf-VLSFLANVSMAVSLVl 3 YQ 
yWRNKPDPHPTLPJVAGWKKYPLFFGTAVFAFEGIGWLPLENQ 

liPODVWL YQSVK 1 1 ,Y S r G I FVTYS 1 OFYVPAET 1 1 PG3 TCKFHT 
KWKQTCEFGIRSFLVSl TCAGA1 LI PRLD3 VISFVGAVSSSTLA 
L1LPFLVEILTFSKEHYNIWMVLKNIS1AFTGWGFLLGTYITV 
EEI I YPTPKVVAGTPQS PFLNLNSTCLTSGLK 


5616 


1


719


DDr VKkT^F(JSAA^KJHSAKubKAV^ VboK J i THFE 
LKHLSSGDLLRDNMLRGTLIGV1J\KAFID0GKLIPDDV>)TRLAL 
HELKNLTOYSWLLDGFPRTLPOAEALDRAYOIDTVINLN^/PFEV 
1 KQRLTARW I HPAS GR VYKI EFNP P KTVG I DDLTGE PL 1 OREDD 
KPETV1 KRLKAYEDQTKPVLEYYOKKGVLETFSGTETNKI WPYV 
YAFLQTKVPQRSOKASVTP 




5617


176


765


PWRGRG^RPRGAGAKAEF.OVNRSAGLAPDCEASATAETTVSSVG 
TCEAAGKSPEPKDYDSTCVFCR3AGR0DPGTELLHCENEDLICF 
KDJKPAATHHYLWPKCTnGNCRTLKKDQVELVENMVl'VGKTIL 
F RNN FTD FTNVRM6 FHM P P FCS 1 SH LHLHV 1 J\ PVDQLC FLS KLV 
YRVNSYWFITADHL1 EKLRT 


5618 


3 


1692 


YLNYIKLKSENKLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
LFLNSGGDSLKS I RLLS E I EKLVGTSVPGLLE 1 1 LSSS 1LE1 YN 
H I LOT WPDEDVTFR KS CATKRKLS N I NQEEASGTS LH-QKA I MT 
FTCHNEINAFWL5RGS0ILSLKSTRFLTKLGHCSSACPSDSVS 
CTNIONIjKGLNSPVAJGKSKDPSCVAXVSEEGKPAIGTOKMELH 
VRWR S DTG KCVDAS PLW I PTFDK S STTVY 3 GSH S HRM KAVDF Y 
SGKVKWEQ1 LGDR I ESSACVSKCGNFl VVGCYNGLVYVLKSNSG 
EKYVJMFTTEDAVKSSATMDPTTGL2 YIGSHD0HAYALD1 YRKXC 

I h.' i/(> r/ /'/"■"TtmO C n^i tH ~ DLTITT vrfil"! f~°"~~1 T T 7» \fkt O UT( 'Ml f T t»I 

VVvKo KCbt-* I Vh i>b PCLNLi- rHHi^i r Al 1»<»oL»L»JUA VN W8i V 1 V* 
KHSCGKPLKSSPOCCS0Y1C1GCVDGN-.LCFTHFGEQVWQFSTS 
GP1FSSPCTSPSE0K1FFCSHDCFI YCCNMKGHLQWKFETTSRV 
YATPFAFHNYNGSNEMLLAAASTOGKVWILESOSGQL0SVYELP 
GEVFSSPWLESML1 1 GCRDNYVY CLDLLGGNQX 


5619 


2160 


147*/ 


biS PVLPTSGNV I STAQPAQPWSAVEAALRSLGSP PGAGRGCPCP 

RSCPOPRPt.EEl^LRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVL 
LWGTKGRGSGSPSSPGCCLHPPAOHSODLPLVHVDVGWQPPLGP 
TVGLRPGLLGERORCALRAGDPOCOCPLPATVREDLGVPSPWAA 
ECSPPATP 


5620 


930 


182


PLPFPTLAMFLTRSEYDRGVNTFSFEGRLF0VEYA3 EAI KLGST 
AIGlOTSEGVCIoAVEKRITSPLMEPSSIEKIVEIDAHlGCAMSG 
L I ADAKTL I DKARV ETCNliWFTYiraTMTVESVTOAVSNLALQFG 
E EDAD PGAMSRP FG VALL FGGVDEKGPQ1>?HMDP£GTFV0CDAR 
AI GSASEGAQSSL0EVYHKSMTLKEA1 KSSLI ILKQVMEEKLNA 
TNI ELATVQPGQNFKMFTKEELEEVI KD1 


5621


3 


819 


W E FV E Y TATDAN V KN ES LS S VQQLG I KMT VR YG KFLS LLKDG A 
ENDLTWVLKHCERFLK0QOTSIKSSLLCLQGNYAGHDWFVSSLF 
MIMLGDKEKTFQFLHOFSRLLTSAFLWLPRLHISSYLPNDTVES 
G1HPVYFCSTHYIEKLLKAELPLVFSAFHMSGFAPSQ1CLQWIT 
0 C FWN YLD W I E I CKY I ATCVFLG PD YQVY I C I A VFKHLQOD I LQ 
HTQTODLQ V FXKEE ALHGFR VS DY FE YM E 1 LEON YRTVLL-RDMR 
N I ROOST 


5622 


1122 


456


AASTKDAVsRKRSKSASEKSGTGTSISKRLNKiNpQ^ 
PGTFYFOFKNLWEANDRNETWLCFTVEG 1 KRRSWS WKTGVFRN 
QVDSETHCHAERCFbSWFCDOlLSPMTKYQVTWYTSMSPCPDCA 
G E VAEFLARHSiWNLT I FTARL Y Y ?QY P CYQEGLRS LSOEG VAV 
EI MDY EDFKY CWENFVYNDNEP F K PWKGLKTN F2 LLKR RLRES L 
Q 


5623


3 




FLP F F I RAP K I SRNGQ WL FTF'J TP} P FANKALPGWEG I V P ACFW 
RKKI bTPSTGTMELLQVTILFLLPS I CSSNSTGVLEAANNSLW 
TT7KPSITTPNTESL0K3TWTP7TGTTPKG71TNEbLKMSLMST 
ATFLTSKDEGLKATTTDVRKNDS J : SNVTVTSVTLPNAVSTLOS 
SK?KT2TCSSIKPTEIPGSVbOPDASPSKTGTLTSIPVTIPENT 
SCS0VIGTEGGKNA5TSATSRSY5SIILPWIALIVITLSVFVL 
VGLYRMCWXADPGTPENGHDOPCFDKESVKLLTVKTTSHESGEH 
SAOGKTKN 


5624


IbS 


B9fc 


FGVAAAAGALPOYHGPAPALVSCR P ELS I .SAGSLQLER KRRDFT 
SSCSRKLYPDTHALVCLLEDNGFATQOAEI 3 VSAbVKl LEANMD 
IVYKDMVTKMOCEITFO0VMSOI ANVKKDK1 1 LEKS EFS ALRAE | 
NEK 3 KLEIjHOLKOQVMDEV: iCVKT?TK"->DFKLEKSKVKELYSLN 
EKKLbELRTEIVALHAQODRALTCTDKMETEVAGLKTMLESHK 
LDN2 K.YLAGSI FTCLTVALGF YRLWI 


5625


1 


a: ho 


TI PSS AAACRAG P P AGALE ALS PGC ARAHAERRGEMRATP LAAP 
AGSLSRKKRLELDDNLDTERPVOKR/JISGPOPRLPPCLLPLSPP 
TAPDRATAVATASR IG F YVLLS P EE GG RAYQALHCPTGTE YTCR 
VY PVQ EAUAVbEF YARLPF H KHVAP PTE VLAGTQLLYAFFTR TH 
GDMHSLVRSRHRlPEPEAAVLFROt'ATALAHCHQHGLVLRDLKL 
CRFVFADR ERKKliVLENLEDSCVLTGPDDSLKDKHACPAYVGPE 
IhS S R ASY SGK AAD VWS LGVAL FTK 1 ,AG H Y PFQDSEPVLL FGKI 
RRGAYALPAGLSAPARCLVRCLLRMTPAERLTATGILLHPWLRQ 
DPKPI^PTRSKLWEAAQWPDGLGLDFAREEEGDREVVLYG 


5626 


3123 


20 1 J 


PPRALGSVAMENQVLTPKVYWA0RHRELYLRVELSDV0NPA1SI 
TEKVLHFKACGHGAKGDNVYEFHLEFLDLVKPEPVYKLTCROVN 
1 TVQK KV5QWWERLTKQEKRP Lr I .AF D FDRWLDESDAEMELRAK 
EEERLWKbRLESEGSPETLTNLRKGYLFMYNLVOFLGFSWIFVN 
LTVR FC I LGKES F YDTFHT VADMM Y FCOMLAWETI NAAIGVTT 
S PVL PS LI QbbGRN FI L>F1 1 FGTMEF KQNKAVVFFVF YLWS A I E 
I FRYS FYMLTC 3 DMDWKVLTWLR YTLW 1 PLYPLGCLAEAVS VI 0 
SI Pi FNETCRFSFTLPYPVXl KVRFSKFLQ1 YL3M1 FbGLYl NF 
RHLYKQRRPIRYGQKKKKJH 




5627


3123


2ci; 


PPRALGSVAMEKOVLTPHTVYWAORRRELYLRVELSDVONPAI S 1 
TENVLHFKACGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQROVN 
ITVOKKVSQWWEKbTKQEKRPLFIAPCFDRWLDESDAEMELRAK 
EEERLNXLRLESEGSPETLTNLRXGYLFMYNLVQFLGFSWI FVN 
LTVR FC 1 LGKES FY DTFH7VADMMY FCQMLAWET I NAA I GVT7 
SPVLPSLIOLLGRWFILF1IFGTMEEM0NKAWFFVFYL.WSA3E 
IFRYSFYMJLTCIDMDWKVuTWLRYTLWIPLYPLGCLAEAVSVIQ 
S I ? 1 FNETGRFS FTLPY PVKI KVRFS FFLQI YLIMI FLGLY INF 
RHLYKCRRRRYGCKKKKIH 


5628


75 




VAGAMASKCliKAGFSSGSLKSPGGASGC-STRVSAMYSSSPCKLP 

SLSPVARSFSACSVGLGRSSYRATSCLFALCLPAGGFATSYSGG 

GGWFGEGI LTGNEKETMOS LNDRLAGYLEKVROLEQENASLESR 

IREWCEQQVPYMCPDY0SYFRT1EEL0KKTLCSKAENARLWE1 

DNAKLAADDFRTKY'ETEVSLRQLVESDIKGLRRILDDLTLCKSD 

LEAOVESLKXEbLCLKKNHEEEVKS^RCQLGmi^EVDAAPPV 

DLNRvLittMRCyjfcT l»Vi.NNKKIJ/\bUWL»Ui ybLt»LJvtKVVVi>oi>fc 

QLQSCQAEIIELRRTVNALEIELOAOHSKRDALESTLAETEARY 

S SQbAQKQCM I TNVEAQLAE I RAPLE RQNQE YQVLLDVRARLEC > 

EINTYRGLLESEDSKLPCNPCAPDYspsKSCLPCLPAASOGPSA 

ARTNCSARPICVPCPGGRF 


5629


2287 


938 


GRPRSSSDNRNFLRERAGLSSAAVOTRIGNSAASRRSPAARPPV 
PAPPAbPRGRPGTEGSTSLSAPAVLVVAVAVVVVWSAVAWAMA 
NYI HVP PC-S PE VPKI/NVTVODQEEHR CREGALSLLOHLR PH WDP 
QEVTLCLFTDGI TNKLI GCYVGNTMEDVVLVRI YGNKTELLVDR 
DEEVKS FRV LQAHGCAPOLYCTFNNGLCY EFICGEALDPKHVCN 
PA I FRL I AROLAXI HAI HAHNGWI PKSNLWLKMGK YFSL I PTGF 






WO 01/53312 








ADEDI N'KK i-'LSDIPSSCl LQEEMTWMKElLSNLGSPVVLCHNDlT" 
LCKMI iyKEKOGDVC- :? lC , y-YSGYNYL 1 AYDlGNHFTJErAGV£DV 
DYSLYPDRELQSQWLKAYLEAYKEFXGFGTEVTEKEVEILFIQV 
NOFALASHFFMGLWAL10AKYST1EFDFLGYAIVRFNQYFK14KP 
EV7ALKVPE 


5630 


1194 


278 


G FWA I AQTCAHH LF PC S F WLVPAS PWR LPEMS S FG YRTLT VAL F 
TLICCPGSDFKVFEVPA'RPKKLAVEPKGSLEVNCSTTCivlOPEVG 

NS NVS VYO P PRQVI I.TLO PTLVAVGKS FTI ECRVPTVE PLDSLT 
L F l.FRGXE TLH YET FG KAA PA PCjEATA TFNS TADR E DG H RN FS C 
LA VLDLMS R GGN I FK K.H S AP KMLE I Y E P VS DS QMV I J V T WS VL 
LSLF^TSVLLCFIFGCKLRQQRMGTYGVRAAWRRLPQAFRP 


5631 


1053 


290 


SR VDD FVR P EPS RAE PS R SGRKR P AR RAATOS V FG KJLFG AGGGK 
AGKGGPTPOEAIQRLKDTEEMLSKK0EFLEKKIE0ELTAAKKHG 
TKMtJlAALOALKRKKK YEKQ^AQI DGTLSTI EFQREALEN/iNTN 
TE YLKNhG Y AAKAMK7 iAH DNMD1 DXVDELhOQD I ADQ QE LA E E 1 S 
TA IS KPVG FGEE FDEUELMAE LEE LEQEELDKNLLE I SG PETVP 
LPNV PS I AJbPSK PAK K K EE EDDDN KELENWAG S M 


5632 


3 


7TF~Z 

95* 


WLGWS PPR RLWWGS U3AAQR PAVPVSGLARSLHVKTK R PH R RA 
SVRVARGRLGVWAQPC'PLLPRPVGSRREMOPPGPPPAYAPTNGD 
FTFVSSADAEDLSCS 1 ASPDVKLNLGGDFIKESTATTF1.RQKGY 
GWLLEVEDDDpEDNKri LEELDTDLKDI YYK1RCVLMPMPSLGF 
NRCWRDNPDFWGPLAWLFFSMISLYGQFRWSWI ITIK3 FGS 
LT 3 FLl«ARVLGGEVAyGQiVLGVlGYSLLPL»IVl APVLLVVGSFE 
WSTLIKLFGVFWAAYSAASLLVGEEFKTKKPLL1 YPI FLLYI Y 
FLSLYTGV 


5633 


771 


460 


CGCSKTMSVGRPFYRSSEFMEOLLSSHLHOVPFFCCFTWCLCN 
CLFENSVSKLYMLCFKFFrtSlFFYSLSITKLNLlYLWGLSYOSL 
LLI.LLSGHRPWGSSMV 


5634 


1446 


855 


PRATGRIRSRAAASRP.^AGAGASGAEPRSGRERSRLSGRRAPAM 
ARNTLSSRFRRVDIDEFDENKFVDEOEEAAAAAAEPGFDPSEVD 
GIAjFOGDMLRAFHAALRNSP\WTKNCAVKERAC^VY1>K\TLTNFK 
SSE'I EOAV0SLDRKG VDLLMK Y I YKGFEKP TENSS A VLJ iQWH EK 
ALAVGGLGS 1 1 RVL7AR KTV 


5635 


3 


943 


DRGFNSTAVDTGRARVS FWRFPLDPGVKNSNVQ3 SCEK3RFRTL 
R S LFH P FPVTRSGAPRA VLVGS SWP AKMVAPAVKVARG W SGLAL 
GVRRAVL0I>PGLTOVRV."SRYSPEFKDPLIDXEYYRKPVSELTEE 
EKYVRELKKTOLI KAAPAGKTSSVFEDPVI SKFTNMMM1 GC-NKV 
LARSLM30TLEAVKRKOFEKYHAASAESOAT3 ERNPYTI FriOAL 
KHCEPMIGLVP1LKGGRFYQVPVPLPERRRRFIAMKWMITF.CRD 
VKHnRTT.MPFKTJWKT.I T^FPNnf?PVTKR)CHnT,HKMAFA*JRALA 
HYRWW 


5636 


2253 


1143 


LEDT1 COH P P AEKKLY h YH RK LRE V ERK G 3 PRLP KDV FMDTH 0X3 
LTD VRAKVTCS FSEGW i ? SVKGG FS S FSQ ATHS AAG AWS KFREI 
ASL3 RNKFGSADNI PNLXDSLEEGQVDDAGKALGVI SHFOSS PK 
YGSEEDCSSATSGSVGANSTTGGIAVGASSSKTNTLDMOSSGFD 
ALLH E I QE 1 R ETQARLE E SFETLKEH YQRD Y SL 1 MQTLQ E ER Y R 
CERLEEQLKDLTELHGK EI LNLKQELASMEEKI AYQS Y ERARDI 
QEALEACQ7R I S KMELQ GXX30QVV0LEGLENATARNLLG K L I N I 
LIAVhlAVLLVFVSTVAKCWPLMKTRNRTFSTLFLVVFIAFLWK 
HWDALFSYVERFFSSPR 


5637


946


2532 


MSFCGARJ\NAi<mAAYKGGTSAAAAGHHHHHHHHLPKL?PPHLK 
HHHH PQHH LH PGS AAAVK P V0C HTS S AAAAAAAAAAAAAM LN PG 
CX?0PYFPSPAPG0APGFAJ^PAQVOAAAAATVKAHHHOHSHHP 
QOOLDIEPDRPIGYGAFGWWSVTDPRDGKRVALIOWPNVFONL 
VSOOVFRELKMLCFFKHDNVLSALDI LCPPHI DYFEE I YWTE 
LMQS DLHK 1 J VS PQPLS S DHVKVFLYQ J LRGLKYLHSAG 1 LHRD 
I KPGKLLVUSttCVLKI CD FGLAR VEELDESRHMTQEWTO YYRA 
PEILMGSRKYSNAIDIWSVGCIFAELLGRRILFOAQSPIOOLDL 
ITDLLGTPSLEAMRTACEGAKAHILRGPHKQPSLPVLYTLSSQA 






PCT/US00/34263











THEAVHLLCRMLVFDPYKRISAKDALAHPYLDEtmRYHTCMCK 
CCFSTSIX3RVYTSCFEPVTNPKPDDTFE)0NLSSVR£VK£I IHQF 
JLEOOKGNRVPLCINPOSAAFKSFISSTVAOPSE'MPFSPLVWE 


5638




1155 


DRKMSEl»D0LRCH^Cl*KNOIRDARKACADA:LS01TNWIDPVG 
R 1 QMRTRRTLRGHLAX1 YAMHWGTDSRLLVS ASCDGKL 1 1 WDSY 
TTNKVHAI PLRSSr.-VMTCAYAPSGNYVACGGLD?C ] CS 1 YNLKTR 
KGNVRVSRELAGHTGVLSCCRFLDDNQIVTSSGDT*rCALWDIET 
GQQTT T FTG KTG DVKS L S LAPDTRLFVSG ACD A S AKLW D VR EGK 
CRQTFTGHESD1N.MCFFPNGNAFATGSDDATCRLKDLRADQEL 
KTYSHDN11CG1TSVSFSX SGR LLLAG YDP FN C NV VCD A L KADRA 
GVLAGHDNRVSCU3VTDDGMAVATGSWDSF1.K 2 W\' 




5639


125


1155 


DRKMSELDQLRQEAEObKNOIRDARKACADATLSOlTNNIDPVG 
RIOMRTRRTLRGHLAKIYAMHWGTDSRLLVSASCDGKLJIWDSY 
TTNKVKAI PLRS SW VMTCA YAPSGNY VACGG LDN 1 CS1 YNLKTR 
EGNVRVSREUVGKTSYLSCCRFLDDNQIVTSSGDTTCALWDIET 

CROTFTGHESDINAICFFPNGNAFATGSDEATCKbFDLRADOEL 
MTYSHDNlICGlTSVSFSKSGRLLLAGYDDPNCr^DALKADRA 
GVItAf^HDNRV^Pl /=\rrDDGMAVATGSWDSFL.K j WK 


5640 


280


1092 


WGNKKTML.SHIITMMKORKQOATAIMKSVHGNDVDGMDLGKKVS 
IPRDZMLEELSHLSIsTtGARLFKKRORRSDKYTFENFOYOSRAQI 
NHS I AMONG KVDGS :>TbECGSQQAP L7P PNT PD PR S PPN PDN I AP 
GYSGPLKEI PPEK?.\ T TTAVPKYY0SPWEQA1 SNDPr LLEALYPK 
bFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELL.L 
LTDPRFMSFVNPLSGRRSFNRTPKGWISENIPjVITTEPTDDTT 
VFESEDL 


5641 


2 » 


33 2 


rDUNrurnuvi t cunMnvt PMi.*iJ!.FTiruf:7 t y"\ ryi^ t HKI .1 n*v 

LKnn LNbUV AiiLd N^/LUiUjI* .H-r HL>t 1 r n\yLtX-ii~ ■ JL.LAjc> Jl \/l\±ji \/t\ 

EIILSDHSSILVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 
NGESFVLSMIVTG 


5642 




1247 


I T PCRMDFLVLFL FY I*AS VLMG LVLI C VCS KTH £ L KG LARGGAC* 
IFSC1 1 PECXORAMHGLLHYLFHTRNHTFIVIjKLVLOGMVYTEY 
TWEVFGYCOELELSLHYLLLPYLLLGVNLFFFTIjTCGTNPGIIT 
KAXELLFLKVYEFD^WFPKNWCSTCDLRKPARSKHCSVCWWC 
VH R V OH H C VWVNN C I GAWN I R Y FL I YVLTLT A ? AA TV A I VS TTF 
LVHLWMSDLYQETY1 DDLGHLHVKDTVFL1 OYLFLTFPR I VFM 
LGFVWLS FLLGG Y L.LFV LY IAATNO/TTNE WY P G D W A W CQ R C P L 
VAWPPSAEPQVHRN 1 HSHGLRSNLQEI FLPAFPCK ERKKCE 


5643 


1 


847 


PSGG VR DVETRGPGSRAARG PR WMKRRGVGAG7*. ] AK K KLAEAK 
YIG5RGTVLAEDQLAQMSKQLDMFKTNLEEFASKHKOE1RKNPEF 
RVQFQDMCATIGVDPlASGKGFWSEMbGVGDFYYELGVOHEVC 
LALKHRNGGLITLEELHCOVLKGRGKFAQDVSCDDLIRAIKKLK 
ALCTGFGI I PVGGTYLIOSVPAELNKDHTWl^LAE KNGYVTVS 
EI KAS LKWBTERAROVLEHLLKEGLAWLDL0APG E A1IY WLPALF 
TDLYSQE I TAEEAR E ALP 


5644 


83 


1138 


PRRMGSWVOLITSVGVOQNHPGWTVAGQFOEKKKFTEEVIEYFQ 
XKVSPVHLKILLTSDEAWKRFVRVAELiPREEADALYEALKNLTP 
r^AIEDKDMOOKEQQFREWFLKEFPCIRWKIOESIERLRVIANE 
I EKVKRGCV I ANWSGSTG 3 LS VI G VMLAP FTAG LS LS I TAAG V 
GLGIASATAGI ASS 1 VENTYTRSAELTASRLTATSTDOLEALRD 
I LHDI TPNVLS FALD FDEATKM I ANDVHTLR RS KATVGR PL3 AW 
R Y VP I NWETLRTP G A PTR I VR KVARNLG KATSG VLWLDWNL 
VQDSLDLHKG E KS E £ A5LLRQW AQE LEENLN E LTH } KOS LKAG 


5645 


53*? 


799 


VQSVRDLKRLSPTDPPGDSGNRDVTREDPVTGPLKSASSOVPTL 
YLCLQKSLLGHSSVEDARATMELY0ISORIRARRGLPRLAVSD 


5646 


3745 


3328 


AEQYGTS PHLLPTWLLSSCLPPANVTTKAATPPP LVLS LTTADP 
AGKPAPCRVTLTL.LRA5IPATKRASFLSSFIK.^FFEELEYILGF 
LS LLKFHVHVS VYS A I CHFQKEGTGNSRS FTCTP ELFPR LQTHL 
RAEGGAO 


5647 


288 


800 


GVIMATSELSCEVSEENCERREAFWAEWKDLTLSTRPEEGCSLH 1 
EEDTQRHETYKQCG0CQVLVORS PWLMMRMG I LGRG LOE YQLPY J 
0RVLPLPI FT PA KMG ATKEE R EDT P I QL QE LLALET A iA? GQC VD 
RQEVAE1TKQLPPWPVSKPGALRRSLSRSMSQEAQRG 


5648 




1516 


VLSSLCGKHEALREVGAEWPPPTCSPKICSGLQQAGNTDWSLTM 
A PQS LPS 5 R MAPLG M LLGLLMAAC FT F CLS H ON LK E F AL. 1 ji PE K 
SSTXETERKETKAEEELDAEVLEVFHPTHEWOALQPGC^VPAGS 
HVRLNLOTGEREAKLQyEDKFRNNLKGKRLDIMTNTyTSODLKS 
ALAKFKEGAEMESSXEDKARQAEVKRJLFRP1EELKKDFDELNVV 
lETLWlMVRLINKFNSSSSSLEEKIAALFDlEYiVHOMDNAOD 
LLS FGGLO W I NGLNSTEPLVXEY AAFV LGAAFSSN P K VQVEAI 
EGGAL0KLLVILATEOPLTAKKKVLFALCSLLRHFPYACR0FLK 
LGG LQVLRTLVQEKGTEVLAV R VVTLLYDLVTEKMK AF F. £ AELT 
C'EKSPSKLQQYROVHLLPGLWEQGWCEITVHLLALPE^DAREKV 
L0TLGVLLTTCRDSYRQDPQLGRTLASLOAEY0VLASLH LODGE 
DEGYFOEI.LGSVNSLLKELR 


5649 


1172


3006 


KLQEQLDAINEEI RMIQEEKESTELRAEE J ETRVTSGSMEALNL 
KOLRKRGSlPTSLTDbSLASASPPLSGRSTPKLTSRSAAODLDR 
MG VMTLPS DLR KHR RKLLS PVSREEN3EDKAT I KCETS P PS S PR 
TLRLEKLGHPALSOEEGKSAbEDOGSNPSSSNSSODSLHKGAKR 
KG I KS S I GRLFGKKEKGRLI OLSRDGATGH VLLTDSEFSMOEPM 
VF AK LGTQAEKDRR LKKXHQLLED ARR KGM PFAQWDGPT W SWL 
ELWVGMPAWY VAACRANVKSGA1 MSALSDTE1QKEIG1 SNALHR 
LKLRliAlQEMVSLTSPSAPPTSRTSSGMVWVTHEEMETLETSTK 
TTSEEGSWAOTLAYGDMNHEWIGNEWLPSLGLPOYRSYFKECLV 
DARMLDHL.TKKDLRVHLKMVDSFHRTS1»QYG3 MCLKR1-NYDRKE 
LEKKREESQHE 3 KDVLVVTCNDQVVHIWQSIGLRDYAGNLKESGV 
KSALLAliDENFDHNTLALILQIPTONTQARQVMEREFNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLOPPPAPPKKIMPEAHSHYLYGHMLSAFRI 


5650 


1172 


3006 


ML^EOLDAINEEIRMIOEEKESTELRAEEIETRVTSGSMEALNL 
KQLRKRGS I PTSLTDLSLASASPPLSGRSTPKLTSRSAAODLDR 
MGVMTLPS DLR KHR R KLLSPVS REENREDKAT I KCETSP P5 SPR 
TLRXEKLGHPAI^QEEGKSALEDQGSNPSSSNSSODSLKKGAKR 
KGiKSSIGRLFGKKEKGRLlQLSRJDGATGHVLLTDSEFSMOEPM 
V ? AKLGTQ AEKDRR L KKKHQLLE D ARR KGMPFAQ W DG PT WS WL 
ELWVGMPAWYVAACRANVKSGAI MSALSPTE1 CREIGISNALHR 
LKLRLAI0EN5VSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGSWAO^LAYGDMNHEWIGNEWLPSLGLPOYRSYr MECLV 
DARMLDHLTKKDLR VHLKMVPS FHRTSLQYGI MCLKRLN YDRKE 
LEXRREESOHEIKDVLVWTNDQWHWVOSIGLRDYAGNLHESGV 
H G ALLALD EN FDHNT LALI LQ I PTONTOARQVMEREFNNLLALG 
TDR KLDDGDDKVFRRAPSWRKRFR PREHHGRGGMLSASAETLPA 


5651 


64 6 


18S9 


AROG0R0PWG* EARA KGPASESPRV* EGSGWEGPASP* TPGSTL 
AWGEGAG3 R* ASGLTAAGAAS AAAA/ PPPTRGGPAPAGCGRAPP 

fJD» r>T DtJIifurDnD? DDCDKA Dp> TJR f "CUfSTlV TV 1 IVT.CPk C DZX^P 

WFAPLKVPTHbKArAKK&KAftrl<Ar/^l«n , s7 i AttrtfVuo Jt/vd r rtVj r 
AD P ♦ L PGKS S QS PPRG * RHGRS RS AP APAH PEH PA PAG S AS ASQ 
QTPGWPGSCC1^CK?WQAEPLGAPGAEDG\PVPPCRGFPLGTLGS 
P AGSWAGLAG YG * AGAPGTOATAPRAAGQT P VAAAPNCR V * GS A 
PALHRAPAAADPGSPLOAPPRAWASPAAAGPGLSSSDYCC-GLGA 
GWRAGISPELJX5AAGLSDNWARCPGPGPAE *GGCPGCRTI PAS A 
CM PS P PVEG S LGLSR KGHGDLPSQAR * GWHECRRARHLV F L PRL 
LGPRGRT3RPSSPS 


5652 


735 


343 


HHKKY0H1HOKSFSCPEPACGKSFNFKKHLKEHMKLHSDTRDYI 
CiFCARSFRTSSNLVIHRRIHTGEKPLOCEICGFTCROK^.SLNW 
TOR KRAETVAALRF PCEFCGKRFE K PDS V AAHR $ XS HPAi I.IjA 


5653 


66 


1401 


RGRLOSRGRLTLGLVLLLLDILGAROHGORVSHGWKGGFLTAPL 
CFPOPCOPGTRRGRRRSLKEATEPQIAKAEEFVTLKDVGKDFTL 
GDVJE01XjLECGDTFWI!)TAXPNCQDLFLLDPPRPNLTSHPDGSED 
LEPLAGGS PEATS PDVTETKNSPLMED FFBEG FSQEI /SR D V I Q 
GWLLELQFRRSLYFGHLVR* FARRSRKSSEV* YCHQRGKS HGMQ 



WO 01/53312 



PCT/US00/34263



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








ES * IK-EKTOSCVKRFHGRRFHG\DNVSEKTLTPAKSKEYRGEFP 
SYSDHSUQUSVyEGEKPYOCSECGKSFSGSVRLTQHKITHTREK 
PTVHOECEOGFDRKASHSGYPKTHTGYKFYVCNEYGTPFSOSTY 
1WK0KTHAGEKPCKSQDSDHPPSHDTOSGEHQKTHTDSKSYNCN 
FCGKAFTR1FHLTRH0KIHTRKRYECSKC0ATFKLRKHLIQHQK 
THAANV 


5654 


3 


E98 


TLPLFPGRRFRGWRRCGAVAARKNSTGGKVS1NORRDSVRMSAL 
NWKPPnTYGGLAS ITAFCGTFP1 DLTKTRFQ3 OGQTNDAKFKEI I 
YRGMLKALVRI GREEGLKALYSG * VGLHAFLCHCSLFHMG IDFR 
PRLHRSOVKSL-RCV* KEQ1A* ♦ /MFSLLISTLI SKYI YYAADVL 
EKLFVYI OVOTDNNKKI CLFKN1 


5655


2 


867 


rppgiraprolhpaagrrpdasarprfrptvllhdpfolsfppp 
plsypsvfpavarvlporsgdyraagmpolsggggggggdpelc 
atdemipfkdegdpq\rekifaeivnpeeegdladixssi,vnes 
el 1 pasnghevarqaqtsqepykdkarehpddgkhpdgglynkg 
psyssysgyimmpnmnndpymsngslsppi prtsnkvpwqpsh 
avhpltpl1 tysdehfs pgshpshi psdvns kqgmsrkp papdi 
ptfyplspc-gggqitpplgwqgqp 


5656 


228 


! 1066 


FRKVPPLPEFASGPGAAFFHSGRLORSLTKDSAGCKSOCRSRAM 
LVLRSGLTKALASRTLAPQVCSSFATGPROYDGTFYEFRTYYLK 
PSNMNArMENLKKNIIILRTSYSELVGFWSVEFGGRTNKVFHIWK 
YDKFPHRAEVR KALANCKEWQEQS 1 1 PNLAR3 DKOETE ITYLI P 
WS KbQK P P KEG V Y E LAVFOMK PGGPALWGDA FERAI NAm'NLG Y 
TKVVGVFHTEYGELNRVHVLWWNESAD3RAAVRHKSHEDPISWG 
GVRESVNYL\VSQQMM 


5657 


10S 


3052 


GORLQSPRVQMPVOPPSKDTEEMEAEGDSAAEMNGEEEESEEER 
SGSOTESEEESSEMDDEDYERRRSEGVSEMLDLEKQFSELKEXL 
FRERLSQLRLRLEEVGAERAPEYTEPLGGLQKSLKI R1QVAG 1 Y 
XGFCLDVI RNKYECELOGAKOHLESEKLLIjYDTIjQGELQER I OR 
LEEDROSLDLSSEWWDDKLHARGSSRSWDSLPPSJCRKKAPLVSG 
PYIVYML0E1D2LEDWTA3KKARAAVSP0KRKSD\DLDFAVHSQ 
GDPOSSWHCrODSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
ALVKTPPL 


5658 


234 6 


;-54i 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK 
NOLLLALLKCTDTELQLRRDAIFCOALVAAVCTFSEOLLAALGY 
RYNNNGEYEESSRDASRKWIjEQVAATGVLLHCQSLLSPATVKEE 
RTMLEDI WVTLS ELDNVTFS FKQLDEN YVANTNVFYH I EG SRQA 
LKV1FYLDSYHFSKLPSRLEGGASLRLHTALFTKVLENVEGLPS 
PGSCAAEDLOOD1 NAOSLEKVQQYYRKLRAFYLE RSNLPTDAS T 
TAVK1 DQLI RP INA^DELCRLMKSFVHPKPCAAGSVGAGLIP: S 
SELCYRLGACQMVMCGTGI^QRSTLSVStEQAAILARSHGLIiPKC 
I MQATD 1 MR KQG PR VE I LAKN LR VK DQM PQGAP R L Y R LCQ PKMN 
GDL 


5659 


2 


696 


WKRSGEVSPKGELGAWRGNSGRPKIIGRAAEAENEDRTLGRLbP 
GNERSQ PR S PLR LLAPQLKAEAAADKG1JVPVPPPFSSGHSGPC \ 
EREGEGORGRGRSRRGAH LELKPS PGLRAGAPTDRGRGG P AE VA 
AAGGRK MVQKESOATLEERESELSSNPAAS AGAS LE PPAAPAPG 
EDN PAGAG G \ AAVAG AAGGARR FLCG WEG PYGRP WVW EQRKEL 
FRRbOKWELNTYL 


5660 


229 


853 


PVTM«AFSEIjrMPLLINljIVSW-it>r vA7 V I L»l PArKvjHr lAAHL 
CG0DLNKTSR0Q1PESQ5V1SGAVFLI1LFCFIPFPFLKCFVKE 
OR KAFPHH EFVALIGALLAI CCMI FLG FADDVLNLRWRH K LLLP 
TAASLPLLMVYFTNFGNTTIWPKPFRPILGLHLDIiGR*SYHCC 
PYGTYFREPFLVLHILLQVFLFCLCVFPDPFW 


5661 


2 


473 


lklyps pcggi pklpglpreaaaalgas flaeaplpvtvrgsgl 

AGr4A VTCD P XAFLS I C FVTLVFLQLPLAS I CQN * GTDS CAS RG K 
AD FDVTG PHAP 1 1 AMAGGHVBLQCQLFPNI S AEDMELRW YRCQP 
SLAVKMHERGMDMDGEQKWQYRGRT 


5662 


2 


1318 


LR KEGRCR RGSNRG VWAAPAEGliGGRGMLGVRCbLRS VR ?CSS A 
PF F KHKPSAKLS VRDALGAONASGERI KI QGWI RS VRSO KSVLF 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LHVNIX5SSLES1»0VVADSGLDSRELTFGSSVEV0G0LIKS PSKR 
ONVELKAEK J KV 3 GNCDAJOFF 1 KY KERHPLEYLRQYPHFRCRT 

LFQLEP SGXLKV PEEN PFNVP AF LTVSGQimE VMSGAFTQV FT 
FGPTFRAENSQSR RHLAEFYMI EAEI S FVDS LQDLMQVIEELFK 
ATTMrtVLSKCPED^LCHKFl APG0KDRL*HMLKNNFLl I SYTE 

PbTLKPFYKRDNEDGPOELEGSVA'HSLGLMlLLSIWIGQP 


5663 


115? 


69? 


PADIGRSTAKTPGPPRSLEMDDPRYGKCPLKGASGCPGAERSLL 
Vvi>Yr tK.t»rijl r KUvAiLr bijLL.wyv,l^Ub> 1 »y\A>L»i KlsVril>z.NiK 
NLVFI^GIALTKPDLITCLECGKEPWKI KFHEMVAKPPV3 CSIIFP 
ODLWAEQDlKDSFOEAILKKYGKYGtlANFQl.OKGCKSVDECKVH 


5664




118 


S7S 


SLSMESNHKSGDG1.SGTQKEAAL.RALVQRTGYSLVQENG0RK.YG 
GPPPGWDAAPPERGCEIF1GKI,PKDLFEDELIPLCEK1GKIYEK 
RKMMDFNGNNRGYAFVTFSNKVEAKNAIKOWINYEIRNGRLLGV 
CASVDNCRLFVGGI PKTKK 


5665


347 


702 


WQH1.I IL»LHCERT£ PAMITSE1,PVLQDSTNETTAHSDAGSELE 
ETEVKGXRKRGRPGRPPSTNKKFRKSPGEKSRIEAGIRGAGRGR 
ANGHPQQNGEGEPVTLFEWKliGKS7iM0RC 


5666


213 


540 


VSCLPTSCKMITUWODQPVPFNSSHPDEYKiAALVFYSCl FI I 
GLF VN I TAJLiW V PSCTT KKRTTVT 1 YKMNVALVDLI FI MTLPFRM 
FYYAXDEWPFGEYFCQILGA 


5667


1 


6 95, 


HPLPSASLGLPSVSLGVSLCVRSALLEAWPMLPKRRUAnVGSP 
SCDAAS STP PS TR F PG VA I Y LVE P R MG R S K RA F LTGIAR S KG FR 
VXDACSS E ATHVVME ETS AE EAVS WQE R RMAAA P PGCTP P ALLD 
ISWLTESLGAGQPVPVECRHRLEVAGPSKGPLSPAWMPAYACOR 
PTPLTHHNTGLS EALE1 LAEAAG FEGS EGRLLTFCRAAS VLKAL 
PSPVTTLSQLQ 


5668 t 691 
1 


89< 


CSFLFClPDLPLQFLLGRKEEEAVLVGGEWSPSLDGLDPQADPQ 
VLVRTA1 RCAQA0TG 1 DLSGCTK W 


5669 | 407 

i 

1 


! 1 


DSGAPEGI>SFLMST0EGLSMHAHPC>AYTPF1YLHARKRRGEIGD 
ADSRFNDRYAHKSAQLYFLYFVCWI FODVYYPTI KEKNHFFFPK 
ARGAPTKYSGS PIGS PTTTPPTR F PS FNLHFAPHLLASMQLQKL 
NSC 


5670 | 3 

1 

i 


373 


SSECLTMAWIPLLLPLL3LCTVSVASYELA0PSSVSVSPGQTAK 
I TCSGDVLAKK YAR W FQQK PGQAFVLV I YKDTER PSG I PER FSG 
cTCPTTurr tt cn&n\n?npnnvt'rvraTnMPT.WVP 

£> 1 t>Kzl 1 V 1 L/ ± A. OWMyv LUCAL' If L I o« 1 Ulvr 1j*t v r 


5671 


280 


524 


KFPPKKTPPHLGMESAITLWQFM0LLLDQXHEHL1CWTSNDGE 
FKLLKAKKVAKLWGLR KNKTNMNYDKLS RA1RLLFMT 


5672


2 

I 


5S7 


FVPATPDPG VWLPPS RDPAMA KRSSJ,Y I R I VEGKNI>PAKDI TGS 
SDPYC J VKVDNBPI I RTATVWKTIXPFWGEEYQVHLPPTFHAVA 
FYVKDEDALSRDDV1GKVCLTRDTIASHPKGKFSLPSHTGLPSP 
WPPSKSETS PLGS VW S PAQGKP FLLS F EAGA1 FCTPGLCf AACS 
QAWLLLPLP 


5673 


327 


b yt> 


KLSSQTLI QAGDDEKNQRTITVN PAKMGKAFKVMNELRSKQLLC 
DVMI VAEDVKI EAHR WLAACS P Y F CAM FTGDMS 


5674 


17 


9B4 


GGGSMEGES TSAVLS GFVLGA1AF0HLNTDSDTEGFLLGE VKGE 
AKNSITDSOMDDVE VVYTIDI QKY I PCYQLFSFYNSSGEVNEQA 
LKKI LSNVKKNWG H Y KFRRH S DQ 1 MT FR ERLLH KNLQEH FS KQ 
DuVFLLLTPSIITESCSTHRLEHSliYKPQKGUFHRVPLWANLG 
MSEQU3YKTVSGSCWSTGFSRAVOTHSSKFFEEDGSLXEVHKIN 
EMYAS LOEEDKS ICK KVEDS EGA VDKLVKDVNR LKRE I EKRRGA 
OIOAAREKNIQKDPOHNIFLCOAJLRTFFPKSEFLHSCVKSUCID 
MFLKVAVTTTTI SK 


5675 


80 


753 


EGSRRGPTRIJWLSAJU^GRLHFPPGFSSRLIHFRGVSECRRPPG 
KSGVPVSAPGSDGKkn^EERPGMFSLMASCCGWFKRWREPVRKVT 
tiLMVGLDNAG KTATA KG1QGEYPE DVAP TVGFS K I N1>RQGKFEV 
TIFDIjGGGIRIRGIVJKNYYAESYGVIFVVDSSDEERJMEETKEAM 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SEKiIjRHPRZ SGKF 1 LVLANKQDKEGALGEADVIECLSLEKLVNE 

HKCL, 


5676 






FVSSPPPKPVOPARPGGFGLSGRPSLLCQVASTPAHVGVMRSPV 
RDl^RNDGEES TDRTP L.LPGAPRAE AA. P VCCS AR YNLiA I LAPFG 
FFIVYALRVNiiSVALVDMWSNTTLEDNRTSXACPEHSAPIKVH 
HKQTG K KYQWDAETOG W I LGS FFYGY 1 1 TQ1 ?GGY VAS KIGCKM 
LLC- FG I LG7AVLTLFTP I AADLG VGPL I VLRALEGLGEGVTFPA 
M HAJ'IW S S W A P P LER S KLI ,S I S Y AG AC LG TV I $ LPL SG 1 1 CYYMN 
WT^FYFFGTIGIPWFLLWIWLVSDTFOKHKRISHYEKEYILSS 

L ! 


5677


j 


102fr 


PFR DG FLELR R LS VPLCSGPCPLTSL5 RCGERSGGKLVAAARAA 
VTAKTKPLPL1APLAVC0SVKSPAACCVRPR?RAVALPAALGGP 
GRS LPGLTAATMSSFSESALEKKLSELSNSQQSVQTLSLWLIKH 
RKHAGPI VSVWHRELRKAKSNRKLTFLYLANDVIONSKRKGPEF ; 
TREFESVLVTJAFSHVARFADEGCKKPLERLLNIWQERSVYGGEF 
1 0QLKI>SMEDSKSPPPKATEEKKSLKRTFOQIQEEEDDDYPGSY • 
£ PQDPS AGPLLTEELI KALQDLENAASGDATVRQK1 ASLPQEVQ 
DVS LLEK 2 TDKEAAERLSKTVDEACLRKRG PGTS 


5678


3 


593 


SSS V ?SSTPSLPL,PFYL»LLGOIjRLOLI>'GTAHL>SGAGEAAPCPG 
GSGRTAAPRTRADFMQSLMIMNKMKNFKRRFSLSVPRTETIEE 
SLAE FTE0FNQLHNRRT1ENLQLGPLGRDPPQECSTFS PTDSGEE 
FGQ1 'SPoVOFQRROKORRFSMEVRASGALPRQVAGCTHKGVHRR 
AAALCPDFDVSKRLSLPMD3 


5679




2 


€23 


LNSRVDDFVAVPGA1MDEDYYGSAAEWGDEADGGQQEDDSGEGE 
DDAEVQQECLIIKFSTRDY I MEP S 1 FNTLKRYFQAGGS PENVIQL 
TSErJYTAVAQTVNLLAEWLlOTGVEPVQVOETVENHLKSLLIICH 
FDPR KADS I FTEEGETPAWLEQM1 AHTTWRDLFYKLAEAHPDCL 
MLNFTVKVGRVLELRRKVFMNVYFWLLVCFL 


5680 


256 


592 


RRLTSTSEKLQNRNSHTPLESL IHPOPSYKGFGIMFGKKKKKI E 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWKSLLADTANRPKPMV 
DPS C I TPIQLAPMKTI VRGNKPC 


5681 


45 


869 


LLCAKTLGVRTKESQAEGYNRSG1NNHOAEDPRFCPSFCWMRSA 
RCTRPQRXRK£AARPPTPGSCPGGTGMtX3KKCSVWKFLPLVFTL 
FTSAGLW1VYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 
PAS C V F S QVMNWAAFLA LWAV LR Fl Ct»KFKVLNPVJLNISGLVA 
I .CLASFGMTLLGNFQLTNDEEIHm'GTSLTFGFGTLTCWI OAAL, 
TLKWIKNEGRRVGIPRVZLSASITLCVGPLLHPHGPKHPHVCS 
QGPVGPGHVL 


5682 


39 


622 


PSR S CLGTMR KWRHREVNLPSVTQQDAVCPAPI PS PGLSAQTGL 
QKIWGTIHCOVCPGAPAWPGSPWHEEMGLLLLVPLLLLPGSYGL 
PFY^GFYYSNSANDONLGNGHGKDLLNGVKLVVETPEETliFTYO 
GASVILPCRYRYEPALVSPRRVRVKWWXLSENGAPEKDVLVAIG 
t Duccpr'nvnr'DiAJi unn 

JL*KM T\ o r La iJ I LA* K V n LiXV U 


5683 


B9 


778 


GSCG ATALI TR CLAWS VX I SRLAMATY TCI TCRVAFRDADMQRA 
HYKTDWHRYNLRRKVASMAPVTAEGFOERVRAQRAVAEEESKGS 
A.TY CT VCS KK FAS F"NAY ENH LKS R R H V E LEX KAVQ AVNRKVEMM 

N tfJNljX'iN.VjljCaVIJo V -JM^/inlMAM 1\^\J/\1 ri-r\\J r r IVArtr ir t\rt\T< 

EARNVVAVGTGGRGTHDRTJPSEKPPRLQWFEGQAKXLAKHSEDD 
SEDEEHDLC 


5684 


195 


677 


TWCFRGYLGPbVlMXALDEPPYLTVGTDVSAKYRGAFCEAKIKT 
AKRLVKVKVTFRHDSSTVEV0DDHJKGPLKVGA1VEVKNLDGAY 
03AV2NKLTDASWYTWFDDGDEKTLRRSSLCLKGER:-iFAESET 
LDQLPLTNPEHFOTPVIGKKTNRGPRYE 


5685 


779 


1262 


LLLOQPWHCFLLFPPFRFSHHMIPGPPGPHTTGIPHPAIVTPQ 
VK0EHPHTDSDLMHVKPQHEQRKEOEPKRPHIKKPLNAFMLYMK 
EMRAlWAECIXKESMI^IlXJR^KliAl^REEQAKYYElARKE 
RQLHMOLYPGWSARDNYVSPSSIPVALHS 


5686 


128 


1181 


CTWWQVN I TLUDINDNHPTWKDAP Y Y I NLVEMTPPDSDVTTWA 
VDPDLGENGTLVYS IQPPNKFYSLNSTTGKI RTTHAMLDRENPD 
PH EAE LMR K I WS VTDCGF PP LKATS S ATV FVNLLDLNDND PTF 






SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








qnlpfvaevleg jpagvs i yqvvai dlneglnglv'syrmpvgkp [ 
rmdflinsssgvwttteldrbf.:aeyolrwasdagtptxsst 
£ tlt i hvldvn detptf fp a vykv s vs ed vpr \g5gwsg * aakn 
ndvg lnaels y f i tggnvdgkfs vgyhdawrt wgldr ettaa 
ymlil-eai dngpvgkrhtgtatvf vtvldvndkrpi iloss y v 1 


5687 


17 


Sll 


AAPPAPPCG/PPP/PPPAPPT/PCPAA/APASSCQPRLSAGRAA J 
OGDGGAAA"VGHVLWPAVGPVRW?GLOTPVPRPELLPGP\SSS 
LHSDSS YP PDAGLSDDEEP PDAS I . P PDPPPLTV p/ ADA/ PMP V7 : 
SGCRMPSTSASE/AAGG0GACXrjvKGSETPPPAS?0TSEPA?5>P ; 
LP PH L»TGG P G M Y S S E1AKL P N S F S C LG LAGTG AG I * GT AS AHG TG j 
PPVtiPHVCTPSLANPQP\AVGPL/'SSLPLGVSGIGMSA/$APIS 1 
SSPFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPITTHGQEGOGP 
VhDJ | 


5688

1 


3 


420 


LTKWDLFGKCYRLLKTG J EHGAMFEQVGVYKYS/CLYDSRKLFF | 
* SHMI 1 RS LL* KVI DDSLGQI .PLLKELLL* * LNVI DRC1 J I A YV 
LRVEKTFA1TYLKNFTVKVDFSLLGEIPLISMAAILKLW1MKID | 
DGY1PAVF 


5689 

1 
| 

! 


2504 


3 


HEL5GKHI SMVSGNTCNWHPGGHS PGGGGQC-E I T5 KDRGE I PAL 
TWA/RK?1GTWTATKPTHRAG*GC ^EEYQPPPQPCEGPRSTSRG 
GEG*GHAVGPGRE1 GKEGSLP FLG 3- KALGP* SASCORAFEGG AH 
GSTARKPAPATPGTRHPRTMETPF VAQGWPAGPRSQFWDQHPHS 
PGEHRPSG \£FL PACP PRAW PKAG A VAS ATGTG \ PQLPGSRGKC. 
KLPRTREP PLLQAGVJAVR KP PWSL/ JCEGLGQAGRPSGMDS SAS \ 
PQTPGGRGS LE WGLPLYLG PHHDVK * RSDRLG* PP * GGQGGGGH 
GAPSTPGPGGE AW * LPQQTS RPKF G PQAY * GE\ GS PGLQCPCS K 
EL*RVPPGSLGPSTQCKYEPTDKH£\GGABAQLEVSTAGSKSTF 
GOELKGPLDAGRLWPGAPSASSSHK*GG*ERARAGAGHRGST*A 
SSKIEOGRPRPGPTSDALADVEGG^ES/GPHPWPLPGTLPNR/P 
GSPPPA* ASAGR KGTVSTLGGGLi 


5690


142 4 


58 


P S P P AGVCAA P A P LP LLALARRDR m F C S PG AEAAP WQTGG PA I D 
GAWRTSVSALRRGATG/APCSPGAEAAPWQTGGPAIDG\DGELP 
^VRSEEAPRGCGAEGGGPGSGPVRKFGAGRGAHAGQGRQQDPEP 
DGLRHRQHGAASHARHRLQRLRPGHHyNRHVRRJDPQAPPGGPAP 
GHAAALP£RTRGVAEPPAWAHAGSCAKRAGR*SQRT* ERARPRH 
PTFOGRAGS\GOPGYQPPh 1 FHPGPSSFPAAP\GPRGA*GNPQLE 
KAPRSDRNPSOGLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
SLSLLGP/PGAHNLDTAPODR*HGF*GDKRGAPGVAGEDPRPP* 
GNFVR* LLLMP/GVA* RHGTSP? LCPSLGENGGQMDSGNLFGTP 
KG ♦ SHPAFT KST ♦ SME AEKS Y WNHP>IR \DRGRQGVR I NCLRVGE 
CEMWGPYSAPRPGTVFLSSFLSPA5EEH\PEGSSSfNTPFPPAG 
PEGDPGLNS PGLLP 


5691 


107 


ESO 


isndpspgynieomakrgkklvelfytvkgmdvsfsg:lsfied 
vahrmlatgectpedl-cfslqvt^o* ktgt£swg*rfyiveqn* s 
gdapli fsp ylsltgncgfamlvei teramah\cgspggpslwg 

GVGVYVLLESVPLS ys 


5692


1193 


548 


TOAWTRAEKDRkGSVRALRl J KLERGFPT*RGSHPL\QSVPClQK 
PS I FSS YPI /GLPQSGGEPGPVGE0QPVRRPEQPSCGPAS3MPL 
TSRSVPPGRGALPPDS LSTR KGLPR PSTAGHRVRESGHKVPVSO 
RLNLPVMGATR SNLQP PR K VA V PG PTR * RDQDSKQDFSS KPLCS 
VPGLASTGXJTLTPADSGPGTGGRDATRAGLPGVETWGNGVD 


5693 


1258 


1330 


ALTWPVRXGTTWWAOPJiGCSNLVSRARLDLSSRPSONTEPQAP 
*QAGPPSSLRPP\SRRR*APEWPXKATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
PEAlWfRSSRFPLWFPL»RCCFWVSGFKDPNPVLRFF 


5694 


3 


i33e 


GSKEPARSLHRRGSGHKSSAGKWGSVrLSTAGALG*KQLHQ*WT 
0RCL\NN1>SS ESFNASS S LNSLPS T PTASRRNS T I VLRTDS EKR 
SLAESGLSWFSESEEKAPXKLEYDSGSLKMEPGTSKWRRERPES 
CDDS S KGGELK KP 1 5 LGH PGSL K KG K TP FVA V?S PI THTAQS AL j 
KVAGKVEGKATDKGKL&VKm'GLQRSSSD&GRDRLSVAKKPPSG , 
lARPSTSGSFGYKKPPPATGTAIAWTGGSATLSKIQKSSGIPV j 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)




i 

i 




KpVNGRKTSLDVSNSAh:PGFLAPGARSN10YSSLPRPXKSSSNS 
VTGGRGG PRP VSSS I DPSLL»STKQGGLTPSRLK£PTKVASGRTT 
PAP VNQTDKEKEKAXA KAVALDSDNI ShYSIGS PESTPKNQASH 
PTATKLAELPPTPLRA^'AKSFVKPPSIANLDKVNSNSIiDLPSSS 
DTT0C1 


5695


3 


1338 


GSKEFAR51.HRRGSGHK£SAGKWGSVT1jSTAGAU:-*KQLHQ*WT 

0rc1»\nnlsseefnassslnslpstptasrrnst1 vlrtdsekr 
slaesglskfseseekapkkleydsgslkmepgtskwrrerfes 
cddsskggelxkplslghpgslkkgktppvavtspithtaosal 
kvagkpegkatdkgkljwkntgi.qrsssdagrdrlsdakxppsg 
1 arpstsgsfgykkpppatgtatvmqtggsatlsk1qkssgipv 
kpvngrktsldvsk s aepgflapg arsn iqyrs lfrpaks ssms 
vtggrggpr p v5ss i dpsllstkqggltpsrlkeptkvasgrtt 
papvkqtdrekskakakavaldsdnislksjgs pestpknqash 
ptatklaelpptpbrataksfvkppslanldkvnsnsldlpsss 

DTTOCI 


5696 


3 


133 8 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KOLK0*WT 
CRCI>\NNLSSESFMASSSLNSLPSTPTASRRNSTI VLRTDSEKR 
£LAESGL£WFSE£?EEKAP:<KLEYDSGSbKMBPGTSKWRRERPES 
C DPS S KGG ELKKPIS LGh'PGSLKKGKTP PVAVT SPI TKTAQSAL 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IAR PSTSG SFGYKKPP P ATGTATVMQTGGS ATLS K10KSSGIPV 
KPVNGRKT^LDVSNSAEPGfLAPCJARSNIOYRSLPRPAKSSSMS 
VTGGRGGPHPVSSSIDPSLUSTKQGGLTPSRLKEPTKVASGRTT 
PArVNQTDREKEKAKAKAVALDSDNISLKSlGSPESTPKNOASH 
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNS^SLDLPSSS 
DTTQCI 


5697 


1J47 


4 7 


PSEAXSPPACPSAPAPRRS I ISRLFGTS PATEAAPPPPEPVPAA 
QGPATVQSVEDFVPDDRLDRSFI^EDTTPARDEKKVGAKAAQCOS 
DSDGEALGGNPMVAGFODDVDLEDQPRGSPPLPAGPVPSODITL 
SSEEEAEVAAP7KGPAPAPQ0CSEPETKWSSIPASKPRRGTAPT 
RTAAPPWPGGVSVRTGPEKRSSTRPPAEMEPGKGEQASSSESDP 
EGP1 AAQMLSFVMDDPPFESEGSDTQRRADDFPVRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 
ECKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 
RRQQRPPRSRERTAA 


5698 


2 


666 


GAEAAEPOF.DLPPLSGS5RFFOEOOKMNKSIX5PVSFKDVAVDFT 
0EEKO0LDPEOK1 TYRDVMLENYSKLVSVGYHl 3 KPDVI SKLE0 
GEEPW1 VEGEFLLCSYFDEVWQTDDL1 ERIQEEENKPS.RQTV7I 
ETLI *R/ERGhJVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 
R ASSEY 1 S SDGR YARMKADECSGCGKSLUn KLEXTH?GDQAYE 
FNQ 


5699 


2 


1448 


RVROPPGLWVRRTVPAMOCPAGLSRVPGVAG/DPSLPSFKGPRD 
EAAHRGTI OTARHTR KLYVQGPASGPPLPRVSTOVA I *DEKP1A 
R PS /G RTNAPFPQGQKPAGKAAPGPAAAGRVAMR \ P GHPGLLAS 
DSQRSSSKGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL+RS 
TWLVGGARGPEGSG VRGSGWPSGCSD1 GWALAGWIWS »HLDPNT 
WTQICWTGE/SPAPGEEGWAPAPRGPTAEHGHCELTTESOYSNN 
VP ILFQNPS GALRS RRTE PAGWVP PTRHE* DDG * TAAPASGGAP 
VS TPTWAGTP / LNAS LG PTDPQG K PGCRPPCALP K PAGP E R 5 A * 
GGS LGCR / S MLPASSG PP PAPG PRRLAAGA3TS ASARCP PAAAA 
GW0PRRPGFAGRAAUPGPPKPPSS*RELGGLPGPGW*7LDPLPA 
HPAHPPGSA PPWGALGGWAAARASLPWS PSLCLSFPAVTPVAGL 
FPPGRG 


5700 


923 


597 


NGHKGVWE INI Y*RRSNI HKNSKSESHLNQDHS FFPPTPNSARS 
KLHSrGTAKNTGLPljSGAPRORAVPSGRTICOEFSSCLOCAYLD 
E* CSIASSLI KAILRVSVLSE 


5701 


59 


410 


1FEK3CSDTOEFISPEINP0ICSWLIFDKGAK/NKATGKDSLFN 
KWSWJOWLSTCR*MRPGPYFTPYTKlNSK* IK/DANIRCETVKL 
LE ENTGENLKDTGLGNV FLDMTP KTQPTKQK 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5702 


2 


1517 


ETFVDPSOCGGIPSDSPHPVITPSRACESSASSDGP.MPVITPSH 
ASESSASSDGPHPVIT?SRASESSASSDGLH?VITPSRASESSA 

PVITPSRASESSASSDGPHPVlTPSWSPGSDVTLlJiEALVTVTN 

1 EVI NCS1TEI ETTTSS I PGASDTDLI FTEGVKASSTSDPPALP 

DS TEA K PH I TEVTAS AETI^STAGTTESAAPHATVGTP L PTNSAT 

EREVTAPGATTLSGALVTV£?KMPLEETi : ALSVETP£YVKVSGAA 

PVSTEAGSAVGKTTSFAGSSASSYSPSLAAlaKNFTPSETLTMDl 

TTKGPFPTSRDPbPSVPPTTrHSSRGTNSTLAKlTTSAKTTMKP 

PTATPTTARTRPTT\A*V0VKMEVSSS('G-'VWLPRKT$LTPEWO 

KG*C^.S?;TGNSTPTRLTSRSP^•CVSGE^^NG/PSAAAX^HVPYAKR 

GCCP* FKJPPPTIX.SCVTVLRGTQKVPMKGSMSKPL 

LTSTGV^yVWGGASPVPRGVLGLTLAHVLCFSKEKI 


5703


u 1 


1117 


HHKDPRSOGLPRTOECARPELRPl»bCPRALWPVTRLSYRCPWQA 
PKAG1GTKAKPSESHLKLHPGWPSLDR0GEPATLGTGTGHCSDS 
R I LR WHP * HTAAR* PRWRRLPSSHR WTFH L<3V1>R VQDKS * * VSL 
DPSCRPRF1.RTC** YGttRSVASSSNPPPGWSGPGASVFPARPVS 
ALPTGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GSWETAPG S ♦ W CPWL * AAR WTGWRTASG AS AG LGRAABR P S AWA 
RRVAGLLPGOGLTVRR* H* TAGAPAS VP S SOGATRS T APGGDCC 
ACGRGPGSC* HPPFWPVSPSSPVPCPSGK * HLRGPLLSAARPRA 
AGWPRHSPHDTQTPEP 


5704 


V? 


562 


GDYEFDSF^WDDISOAAKDLVTRLMEVECDORITAEEAISHEWI 
SGNAAS DKN I KDG VCAQ 1 E KNFARAKWK KAVR VTTMi KR LRAPF 
OSS TAAAQ S A S ATCTAT PG AAGGATAAAA SG ATS AP E GDAARAA 
KSDtvTVA^KRP*LPPOP0KEVPPQPLWAVSP0PPMEASLQPLMGE 
SPQP 


5705


23 


562 


gdyefdsfywddiscaakdlvtrlmevecdqritaefjoshewi 

BGNAASDKNI KDGVCAQI EKNFARAKWKKAVRVTTLMKRLRAPE 
QSSTAA^CSASATDTATPGAAGGATAAAASGATSAPEGDAARAA 
KSDNVA PR R P * LP PO POKE VPP QPLMAVS PQ PPMEAS L0 PLMGE 
SPQP 


5706 


3 161 


610 


0LGRFX/ i ODTVAIRKVKEVFG'IGA>5RKWlLFTHkED*GGQALD 
DYVANTDNCSLKDLVRECERRYCAFNNWGfVEEQRQQOAELLAV 
IERLGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRCYOAKVE 
WOVEKJIXOELRENESNWAYKALLRVKHLKLbHYEI FVFLLLCSI 
LFFIIFLF 


5707


-;e 


609 


GSPAPTPGFRRRPGRGTPSPGTRHHGGRAEPEPDAPKRAPLKR* 
MFAlQPGl^GGQFLGDPPPGLCQPELOPnSKSNFKASAKDANE 
NWHGMPGRVEP3LRRSSSESPSDKQAFQAPGSPEEGVRSPPEGA 
EI PGA EPEKMGG AGTVCSPLSDNGYAS SSLS 1 DSRS SSFEPACG 
TPRGPGPPCPLLPSVAOA 


5708 




1 1925 


SFSWEETISPCFPKMPAEPWWLSPVSLGAAGWPGOPRPYLDLPA 
Q ASVS R PKDRA* G EAVS LSLSSGDVCG H T DGGG AGS D POA K P KP 
PRCPFTAMPSPRTKQKVRNKVCLL2AIRYSDIPSDVSKAP\GPA 
GNPHDRSSTAA* LHRKAGAGSLCLSASLLPPSFSLGA PGAPS PL 

LLK* S D£ PVKQLPA\SG0GSGA3MPPVGS SDILR PR PTSVSGTG 
RAAG* CSWQPAACCTPRSQ* WAVARSPSRCSRW* RQSGR*RG*S 
SRRRRGP* AvAGRSTPAVP*PCS*GGAGRRAYACRTGWGYAPSR* 
LEP5G PTSGSAL * TWASHSTGA* •SRLCGTAGTGPLCSQSSRS* 
AG * RCCCTAAS PCGGSGPSHPGSPSAHCLS WSGGRT0 PRAPS AH 
GRGRAMGSRCVCTCTGLPCPGIPLSGASPGGSGETGAGRSKTLK 
AARSRXS PRPGSGSRGSY* SHNDNWGTWPAPPSAGHLLVGG ♦ NS 
QRTSSDH * YTGTRR PWAGPGTRCSTAPSR^iAPPVSR CRPPPPPP 
PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPA5S/GEAPA 
PPPRPEFPPPPARRP 


5709 


2 


2C31 


ITI^PLPOTEKClJmn-EMTPi^IYLKA^VEAGGLKELElSWG 
I>HQ1WR VJGAWMRAGMGGCRC^GVMAPFAPH/N ALS FLVN DCS 
LlHNN^C^lAAVFVDRAGEKKLGGLDYMYfAOGPJGGGPPRKGIPE 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LEQYDPPEIiADS£GRVVREKRSADMWKbGCl.]WEVFNGPLPRAA 
ALRNPGK1PKTLVFHYCELVGANPKVKPNPAHKLCNCRAPGGFM 
SNHFVcTNLKLEHlClKEPAEKOXFFOELSKSLDAFPEDFCKHK 
VLF0LLTAFEFGNAGAWLTPLFKVGKFLSAEEYQQK3 1 PVWK 
MFS S T DRAMR I R LLQQMEQF 1 Q Y LD E ? TVNT0 1 P P HWH6 FLDT 
NPA1R EQTVKSM LLJLAPKLNEANLNVELMXH FARL0A XDEQG P 3 
RCNTTVCIjGKIGSYLSASTRHRVLTSAFSRATRDPFAPSRVAGV 
L/GFAATHNLYSMNDCA0K1LPVLCGLTVDPEKSVRDQAFKAIRS. 

flskle3vsedptqleevekdvhaasspgmggaaaswagwavtg 
vs £ l*;s k l 1 r s hpttaptetn i pqr pt peg vp a paptpv pat pt 
tsghwetqeedkdtaedsstadrwddedwgsleoeaesvi^ood 

DWSTGGQVSRASQVS \TPTTMPPN PQS PTGAAGK\ RGLLGTGLA 
GAXLPGATS * R YTAGORV 


5710 


1 


562 


IPGST1SCEVELMAJ?KAKTIDSFTQN0TRJLWJ IDGbDACECDK 
VLQMLDTVRVLFSXGPFIAIFASDPHI I IKAINONLNSVPSGFK 
\LNGHDYMRNI VHbPVFLNSRGL/RQ/LQENFS *LOOOMETFHA 
01 LOG Y RKKLTEEFH RTALGR * ON LV A^OPS 3 DG * DAI G FELYV 
CIAI0FNTNKDDAT 


5711


1526 


2130 


krkpfowttvtoeafskhdvaftetpvlfypdsaopfivksess 

SQ1 AKAVX>S0OR PSLFHECAFHFFS * SLQRHT 3 NLD0G3 P* LL»M 
LGEERQHLFESS/I WTTPHNLK* / FEIHEHLC-SHEGHWTIjFFLL 
OIL 


5712


3 


1391 


GRKLFOSLD3SERLKFLLTLDCVDUTL1VLAEEHGCLDI3KELP 
ETV 3 D) .T.NKCLT FH PS KRPTPDELMKDX VFSEVS PLYTPFTX PA 
SLFSSSLRCADLTLPEDISQLCKD3WDYLAERS3EEVYYLWCL 
AGGDLEXELVNXE13RSKPPICTLPNFLFEDGESFGQGRDRSS/ 
TFR* YHWDIWMPAKK* 3ERCWGRS3 LPITLKMTSL3 LPYSNSN 
NELS AAATLPLI I REKETEYQLNR 3 1 LFDRLLKAYP Y KXNQI WX 
EARVD3 PPLMRGI.THAALLGVTSGAIHAK YDA3 DXPTPl PTDROI 
EVD3 PRCHQYDELLSSPEGHAXFRRVLXAWVVSHPDLVYWQGLD 
SLCAPFLYLNFNNEALVYACMSAF3 PKY LYNFF1,KDNSHV3QEY 
LTVFSOMIAFHDPEJuSNHLNEIGFlPDLYAIPWFLTMFTHVFPL 
HX3 FKLWXlYrXLIXEFLFPI LYWE 


5713 


634 


284 


PVCAVPVDRWPVLPREDQEGOQL*AXLPRDFRf< * FQI LGPMEGH 
TACRCS R RGAQVQHL PRED3 RAAE * DPH LRE VW PGLPTSSATS P 
♦ RAVLTS PCSHLGS ADAASSH WLCGVS FH 


5714 


212 


613 


WGLGLGPTMSSLGGGSOCAGGSSSSSTNGSGGSGSSGPKAGAAD 
KS AWAAAAPAS VADDTPPPER RNXSGI 3 £ E PLNXS LRRSRPLS 
K YSS FG SSGGSGGGS MMGGES ADKATAAAAAAS LLANGHDLAAA 
MA 


5715 


131 


1973 


ESASCOXRSKCLILTLKLELSGSAPXKTSARPGSSLWLPPHSQE 
QTPPAS KloOGGGGGLQTGKGLHP VPVTAAS PLPR WCLFGAVAK\ 
GLPGP * LCPSGAA/GGLORGPGLSPLGAAGKVSCLHPPSMVENN 
DSTCHEH«EGILAARVTPVP\SGXPGRVLXPPGRVCRPPHPAAS 
PK.PPGS /SDLDGPRPOMHLRAFPAAHGGPVNTPHGGEEXTFMSS 
01 RRKETKPL* RKTPAG\NNYQSNSIPVSOSPOLT\T)LLPSAGR 
TOAPSGRGDAGXPTPGHG\LPXASVILTPNCPCSLAGGO*RRGL 
YPXTPK0RRWRRPL/LLGPSQ*GSR0STC*EV\GALGEPVR3PG 
L* PDLSCI LSNGSXHRREGLSFPRSLGPGRRGPAGLOSIjGCSPT 
PXNTACH <5 <tf5HVALO AG HES AR DVGSGHVALO 'KirHD S T OD VGR P 
VWRW3 FLE * LGLSR ETGQATRRGLVWI SPGRAAAAC VACAQALE 
EGPLRLPGQDRGAOPCSHCPGRAAGQFEPGAGAPCRE/GG*DPT 
GLT/GVPGTDPKRGGRXPGOSGQETQGPTVWSGPESPLQPKP*E 
ROE / VGAGASSGVGLS RGRAGGPS SAWEVAAMLLLLRHGSHSEL 
TDLTEAQTSQH 


5716 


1711 


1370 


RV FS LLCEGPGHCYQGAVCREACAAAS PGLDSAAEFHRLCEHTD 
*LPX*GPGYIOHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL^SGFFTI I VGGYSCCMPLKT 


5717 


44 


1489 


LPTEALRESEWVSEiGKCGPRGLVPEGESTSPLPSSVirrEDSLD 
EGPGALVLESDLXLGODLEFEEEEEEEEGDGNSDOLMGFERDSE 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








GDSbGARPGbPYGLSDDESGGGPJVLSAESEVESPAKGPGEARGE 
R PGP ACQLCGGPTGEGPCCGAGG PGGGPLLP P RLLY S CRLCTFV 
SHYSSKLKJo4MQTHSGEKPFRCGRCPYASA0bVNLTRHTRTHTG 
EKPYRCPHCPFACSSLGNLRRHORTHAGPPTPPCPTCGFRCCTP 
RPA3PPSP7EQEGAVPRRPEDALLLPDLSLHVPPGGASFLPDCG 
0\ CGVKGRASAGLDQNHCOS/ SLFPWTCRGCGOELEEGEGSRLG 
AAMCGRCWKGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
KHMKTHSGEKPFRCARCPYASAHLDNLKRHORVHTGEKPYKCPL 
CPyACGNI^ANLKRKGRIHSGDXPFRCSLCNYSCNOSMNLJRHM 


5718 


120 


284 


VAHALSLPAESYGNDVSMTHPOLPPTOLAWDLCRTCLPl^YNFT 
S* *STADPLHL 


5719


A 8 


426 


ELNNG PFOMPLCNGGNLAVTGS WADR S PLHEA ASQGR LLALRT1> 
LS OGYWNAVTLDHVTPLH EACLGDHVACAR7LLEAGANVNAIT 
lDGVrPLFNACSQGSPSCAELLLEYGAKAOP\ESCLPSP 


5720




1051 


LQ A FR NAS E V PM VL VGTQDA I S AA \N P R V Y RRTS RAR K l> S TDLK 
\RCT\YYE\TCGGTYGUJMWSVSFQDVAOKWAIj\RKKOO\LAI 
G PCK\SLPN\ S PSH\S AVS AASI PARAP I NQGHE /SGGGSAFSD 
Y \ S S SVPS TP S I SQRE1.RI ET I AASS TPTP 1 R K0S KRRS N I FTS 
RKGADP\DREKKAAGCKVDSIGSGRA1 PI KQGI LbKRSGKSLNK 
EKKKKYVTLCDMGLLTYHPSLHDYMQNI HGKEI DLLRTTVKVPG 
KRLPRATP ATA PGTS PRANGLSVER SN 1XJLGGGTGAPH S AS S AS 
L H S ER PLSS S AWAGPRPEGLHCR SCSV S SADQWS E ATTSLPPGM 


5721 


97 


492 


rhsspccslrrterssnaavst/ttvoofkrfienyrrhigcva 
vfyaiagglflerayyyafaahhtg: tdttrvgi ilsrgtaasi 

SFMFSY I1>1»TMCRN1j1TFLRETF1»NRYV PFDAAVDFHRLI AST A 


5722 


88 


1043 


VAXDVljAGSSPGGGMAGALLGPRVHGl ravlrvarggvqapgap 
g5i,gvshaaapparpogaaosphrgrr:-?ggggaglppprsprfp 
cesvpaststargprrvsrrlppohpgprgrrrrpgagvgaprr 
grargoagllgrcxsoggrgaereraaloarrgrrpgpepdqs cg 

GR P RRAAAA PGRAPADPQ PPAPR F AP A ? JDVR FP AD APAP AP AP A 
PPPPPHLGALTAGSGEEROSOPRAETLRLGRGAPLPXPRAERGG 
RPKOAECQQNPKRPTPPARGPOSSGDPAMLPORAGLRTGGLAGT 
KSSTREIPEMI 


5723 


88 


104 3 


valdvjlagss pgggmagallg pr vhg 3 ravlrvarggvqapgap 
gslgvshaaapparp0gaaospkrgrr«ggggaglppprsprfp 
qes vpa s ts takg p.3rvsrr lp pqkpg prgrrr r pg ag vgap rr 
g rarg0 ag llg rqgqggrg aer eraalgarrgr r pg pe pdqs cg 
gr pr raaaapgrapadpqppaprpapapdvtr ppadapapapapa 
ppp pphlgaltagsgeeros 0?raetlrlgrgaplp\ praergg 
rpkqaecxxApkrptppargpqssgdpamlpqraglrtgglagt 
ksstreipemi 


5724 


3 


1841 


FTNEAPPAPLFDASASPLSPHRRAKSLDRR3TEPSVTPDLLNFK 
KGWLTKQYECXJOWKKHWFALADOSLRYYRDSVAEEAADLDGBID 
LSACYDVTEYPVQRKYGFQIHTKEGEFTLSAMTSGIRJWWIQTI 
KKHVHPTTAPDVTSSLPEEKNKSSCSFETCPRPTEKQEAELGEP 
DPEOKRSRAREXRRREGRSKTFDWAEFRP I QQALAQERVGGVGP 
Ar>TH\DPVmPEAF^GELFJRERAKRREEPJ^KRF^MLUATCX5PGTE 
DAALRftEVDR S PGLPMSDJ .KTHNVHVE 1 EOR WHQVETTPLREEK 
QVPI APVMLSSEDGGDRbSTHELrSbLEKELEQSQKEASDliEQ 
NRLLCDQLRVALGREQSAREGYVL0ATCERGFAAMEE7HQKKIE 
DLQR0HQRELE KLR EEKDRLLAE ETAAT 1 SA1 EAMKNAHREEME 
RELEKSQRSOISSVNSDVEALRROYLEELQSVORELEVLSEOYS 
QKCLENAl«AOftLEAERQALRQC0RENOELNA^0ELl^LAAE 
I TH LRTLUTGDGGGEATGSPLAQGKDAYEIjEVPSGAR pcltqlc 
T0EPOGSAAHPLS Y R WGGTDLROQESQGPGRS KSPEGGEEQ 


5725 


3 


104 9 


VNGHSEETSQSPNKTEPHDSDCSVPLGISKSTEDLSPQKSCPVG 
SWKSHS 1 7NMEIGGLKI YDILS DN\DLSSHLQPLK/ FTSAVCG 

iotivrskaatllydqplqvftgsssssdlisgtkaifkfdsnhn 

PE/GAKYNW??HKWAHNLHLKYMVLHS1 1 SNTVAV\RSQRHFVA 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








L0TKSPNRPCQPSSSAPS/VD0HAQ/1N0SVAKHSANMNFSN>JN 
NVRAN TA Y HLHQRLG PARHGEKWA I S PNDRL* PAVTRSTI QRQS 
SV£STA£V7*1X3DPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 
SQKPLSARTYSIDGPNASRP0SARPSINE1PERTMSVSDFNYSR 
1SI 


5726


2 


486 


> SRSLSMWVn'NSGLPASSHSSKLPVTVGFSGCVKRLRLHGRPLGAP 
TRKAG VTP CI LGPLEAGLFFPG SGGV I TL / ESVGAGI PGPSRAG 

: QGS PG GSG EG P PLSS PSQPLPADLPG ATLPDVGLELEVR PLAVT 
GLI FK LGQARTPPYLOLQVTEKQ VLLRADDG 




5727


21


221 


RPILI LKETRRLPWATGYAEVINAGKSTHNEDOASCEVLTVKKK 
AGAVTSTPNRNSSKRRSSLPNGE 


5728


2 


877 


GTR NGQFE PRRGRAWEGSAGGLRAPGAAAGGPG VQPRGSG/ LPG 1 
NA1 RAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR \ PVI AGGRP PW PVSG V1#G SR VCG PLY S TS PAG PG / SGG1>S PSO 
GGPAGAGGDAG/LPGRCPSAPWRAGSRPAASCPDWIPGPQGLWL 
HRW PTS /G PPSQ IGEGAEQGDEG VADAPO 1 QCKN/G AEDPPAEC 
EPPOVPEAGEEDAVPAEEGPGGTPETOADOVRERPEAHLAEGGA 
KGSPRRLADPQDLPAG0MSLAPPFPPVAAV1RSNK 


5729


1 


1525 


AGGAK hi Vl^'JLO 1.GHFAGF VGAHWWNOODAALGRATDSKEPPG EL 
CPDVLYRTGRTLHGQETYTPRLjLMDLKGSLSSLKEEGGLYRDK 
QLDAA I AWQGKLTTHKEELYPKN PYLQDFLS AEGVLSSDGVWRV 
KS1 FNGKGSSPLPTATTPKPLIPTEASI RVWSDFLRVHIjHPRSI 
OHIOKYNHPGEAGRLEAFGOGESVLKEPKYOEELEDRlOiFYVEE 
CDYLOGFQI LCTJLHDGFSGVGAKAAELLQDE YSGRGI I TWGLLP 
GF V H RG EAORNI YR bLNTAFGLVH LT AH SSLVCPLSU3GSLGLR 
PEPPVS FPYLHYDAl*liPFHCSAlLATAl>DTVTCS\YRLCSSPVS 
MVHlA ADMLS FCGK KWTAGA 1 1 P F PLA PGQS L PDS LMQFGG AT 
PKTPLSACGEPSGTRCFAQSWLRGIDRACHTSOLTPGTPPPSA 
LHACTTGEE I LAQY UQQQQPG VMS S S HLLLTPCRVAPPYPHLFS 
SCS P PGMVLDGS PKGAAVES VPVFG 


5730 


1258 


1713 


KKFQAPARETCVECQKTVYPMERLLAN00VFK2SCFRCSYCNNK 
I>SLGTYASLHGR1YCKFHFN0LFK5KGNYDEGFGHRPHKDLWAT 
KIETEGFWERPRNFENCGRPLKSPGGEDCPSC*GGCPGSNY*AQ 
GSS S R EKG3QAS WN PKLR VA 


5731 


122 


443


RSHRGEL1 PKDSCYMRXPPRRPKKRRQG/CALPQGCLTFKDVA1 
EFSLEEWKCLNPAQRALYRAVMLENYRNLESVGLTSKDSWYMRK 
KPGRGRGKQRRQEWFFLRVY 


5732


226 


772 


PPSRSCOSPRRKSRRRAKVTVTLVCGFTSFSFSLPLYLCGCLRF 
PERTCSQlrQCADWAPDFGPSSFVPSWGATATGARKFLIAFWI\N 
LLGTK E0A>1R3 ALNLR E0GRGKDOPGRLXXV0GI G WYIiDEKNLA 
OVSTNLLDFEVTALHTVYEETCREAOELSLPWGSQLVGLVPLX 
ALLDAA 


5733


1 


460 


PALOEVWAMALAWGKOYENDARTLFEFTSGVNDTESPI I YRDES 
MRTACS PDGuCSDGNGUEuKCPFTSRDFMKFTlUX5FEAlKSAYM 
AQVOYS MWVTRKNAW Y FAN YDPRMKREGLHYVVI ERDEKYM\AS 
FOE I \VP\EFIGKMDEVLSRDPM 


5734 


3 


966 


RCNSPESLTSLL»VL1»TTANNLFVLI PAYSKNRAYAI F?I VFTVI 
GSLFLMNLLTA2 1 YSOFRGYLMKSLQTSLFRRRLGTRAAFBVLS 
SMVGEGGAFPQAVGVKPONLLQVLQKVOLDSSHKOAMMEKVRSY 
GSVLiLb AfcCi r yK LFNELDKS W KLH r PKPE YQS P FT^SAQFIjFG 
HY Y FDY LGNlil ALANL»VS I CV FLV LDADVLP AERDDF I LGI LWC 
VFIVYYljLEMIXKVFAl^LRGYLSYPS^FTXniLTVVLIjVLEIS 
TL\VCTDCHTQAGGRRWW/RLLSLWDMTRMLNMLIVFRFLRIIP 
SMK PMAWASTVLGL 


5735


2 


54C 


FFTPCVAJ^FNFPDQATVKXAAYSLFRVGGGTSCGLPQARRISL 
ATPROLYK/SSNMTORWQRREISNFEYLMFLNTIAGRTYNDLNQ 
YPVFPW VLTNYES EELDLTLPGN FRDLS K P I GALNPKRAVFYAE 
RYETtfEDDQSPPYH YNTHYSTATSTLSWbVR I VS I FI EIACLWY 
LKILT 


5736


382 


GTRPSTKKSGYSPQQVAVIHCKGHOKENTAVAHSNQKADSAAQV 








TARLrSVT P PNLLPTV S FPQ PDLPDNPV Y STTTEKLAS DLRANKN 
CES* * 1 L PDSG I F3 P * T* TS YLQSTTHLRRAKLPQIiLRR 


5737 




1043 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRANLGPCRRKR 
LQTLMPLAAG FQYSSHKDPS LSAKEKHTDYHNEARGP WPGWVG * 
RTADGSCGRGP DG AHH PG PKS S SW RAS RLLPGl/GGSHHLDAYVG 
RDLECGTFA?LObElPPOPKGHPAPlPTGQAGPRDSGPG.^SP*V 
ETRPLTDORR*PGVRPVGWTPATIPAGTLRPRGAVEPSVSACGKW 
A PS PTSQGCCEGRCDAVPKJ jRA WRTPLCSQ 


5738


it 




DTLSLNCTLPETLPMTPSF* LSFL* FPGLARAKSIPTKTYSNEV 
VTLWYR PPDI t»U5S TDYS TO 1 DMW * GQVEVWQGPCG KGGGLVT7 
ATQ?AAFL»FTVPS1jPRGVGCI FYEMAIGRPLFPGSTVEEQliHFi 
FRI^S EE AWAL.CAVETHK 


5739


j 


1222 


SFQRRGI RWNVHTbHPHPRAVWAGlGRGHGS* ALLGRARAPALC 
FPTLLEFLKSLEPDLPALRAKGLHLWAAGPGTHPAG1SDLLAEV 
SAEVDCPVPG YLSS P0S I TDTCLY I FTSGTTGLPKAARI SHLK1 
lqcqgfyqlcgvhcedvi YLALPLYHMSGSLLGIVGCMG igatv 
VLK S KF£ AGQ FW ED CQQHR VTV FQY I GE LCR Y LVNQP P S KAERG 
iiKVRLAVGSGLRPDTWERFVRRFGPLOVLETYGLTEGNVATINY 
TGQRGAVGRAS WLY KH I FP FS LI R YDVTTGEP I RDFOGHCMATS 
PGEPGL1VAPVS0OSPFLGYAGGPELAOGKLLXDVFRPGDVFFN 
TRDLLVCDDOGFLRFHURTGDPFRWKGENVATTEVAEVFEALDF 
LQEVNTVYGVTV 


5740


26< 


231 


PAYWLKVPTLCLESKTDLREKASHVSAOLQGEVRGI*AGALWM*A 
YVYERVYN*NISRhTVKAl»EOKR}IPAGl>SSSMAI^l^PCljGMLMA 
LQSELHKLYDEETQSKVSGSACGGYP 


5741


1 


650 


PRKTMRRGVLMTLLOCSAMTbPLWlGXPGDRPPPLCGAIPASGD 
YVAR PGDKVAAR VKAVDGDEOW1 LAE WS YSHATNKYE VDD1 DE 
EGKERHTLS RRRVI PLPQW KANPETDPE ALFQKEQLVLALY PQT 
TCF Y RALI HAP PQRPQDDYS VLFEDTS YADGYS P PLNV/^QR YW 
ACKEPKKK * CRLADSPSPNDTGODSRGRAGIKHI PPLKKK 


5742


2 


362


T0S VK E 1 LKRNPNVNLTDKDGNTALM I ASKEGHTEI VQDLLDAG 
TYVMI PDRSGDTVlilGAVRGGHVEl VRAbLQKYADIDI RG0DNK 
TALYWAVEKGNATMVRDI LOCNPDTE I CTKDG 


5743


2


415 


GKTPEG1DAI EE1EI DLEETERE1SPQENGLEEVKPLGEMQTDL 
KATGRETS FREKTPEV1DATEE1DKDLEETGRREI SPEENGPEE 
VKPVDEMETDLKTTGREGSSREKTREVIDAAEV1ETDLEETERE 
ISPQE 


5744


3


703 


TRRTTTTSPTTTRQMTITPAA1.PTTVVTTPDLTTGTPL0MTT1A 
VFTTANTCL5LTPSTLPEEATGLLTPE PS KEGPI LTAESETVLP 
SDSKSSAESTSADTVLUTSKESKVWDLPSTSHVSMWKTSDSVSS 
POPGASDTAVPEONKTTKTGOMDGI PMSMKNEMPISQLLM i j Ap 
S I GFVLFALFVAFLLRGKLMETYCSOKH TRLDY1 GDS KNVLNDV 
QHGRE DEDGLFTL 


5745


140C 


59S 


G KSRFVNLMKHS KKTYDS F0D ELEDY 1 KVQKARGbEP KTCFRKM 
KGDYLETCGYKGEVNSRPTYRMFDORLPSETIOTYPRSCNIPOT 
VENRLPQWLPAHDSRLRLDSLSYCQFTRDCFSEKPVPLNFWCX)E 
YICGSHGVEHRVYKHFSSDNSTSTHOASHKQIHQKRKRHPEEGR 
EKSEEERSKKKRKKSCEeiPLDKHKSIQRKKTEVEIETVHVSTE 
KLKNR KEKK SRDWS KKEER KRTKKKK EQGOERTEEEMLWEOS I 
LGF 


5746 


3 


821 


S?ASGRLTPSSPAFI>GELDK)RYSNGPAVSAWSLGMGAVSWSES 
RAGERRFPCPVCGKRFRFNS I LALHLRTHQPERPRSPAARLLLE 
LEERALLREARLGRARSSGGMOATPATEGLARPQAPSSSAPRCP 
YCKGKFRTSAERERHLH1LKRPWKCGLCSFGSS0EEELLHHSLT 
AHGAPERPLAATSAAPPPOPOPQPPPQPEPRSVPQPEPEPQPER 
E ATPTPAPAAPEE P P APP E FR CQ VCGQS FTQS WFLKGH MRKHKA 
SFDHACPV 


5747


2 


1328 


DRHVETLC 1 HFLG PS TGSTAKTGGRNWLXTGN CX»YGNT CR FVHG 
PSPRGKGYSSNY^RSPERPTGDLRERIKNKRODVDTEPOKRKTE 
ESSSPVRKE5SRGRHREKEDIKITKERTPESEEENVEWETNRDD 








SDNGDI mDyVr^LSLEMKRCKIORELMKbEOENMEKREEI 1 1 K 
KEVSPEWRSKLSFSPSLRKSSKSPKRKSSPKSSSASKKDRKTS 
AVSSPLLDQQRN<fKTNQSKKKGPRTPSPPPPIPEDIALGKKYKE 
KYKVKDRIEEKT.^DGKPRG^DFERQRSKRDKPRSTSPAGQHHSP 
1SSRKHSSSSQSGSS1QRHSPSPRRKRTPSPSYQRTLTPPLRRS 
ASPy?SHSLSS?CRKOSP?RHRSPKREKGRHDHERT£OSHDRRH 
ERREDTHGKRDF.FKDSREEREYiQDOSSSRDHRDDREPRDGRDR 
RE 


5748


934


473 


SEGPQVFYKGl^PTLlAlFPYAGLQFSCYSSLKHLYKWAIPAEG 
KKNENLQNLLCGfGAGVISKTJLTYPLDLFKKRLQVGGFFJIARAA 
FG0VRRYKGLMDr/U<OVL0KECALGFFKGLSPSLLKAAl»STGFM 
FFS YEFFCNVFF. CMNRTASQR 


5749 


552 


j 


GFPVDPRVRGSTLSLAERPKC-MIRSGSFRDPTDDVHGSVLSIAS 
SASSTYSSAEERt'-OSEOlRKLRRELESSQEKVATbTSOLSANAN 
LVAAFEQSLVNKJSRLRHIAETAEEKDTELLDLRETIDFLKKKN 
SEAQAVIQGAU^ASETTPKELRIKRQNSSDSISSLNSITSHSSI 
GSS KDADA 


5750 


22 


I 6 (> 


I F I S J CLWNAH LCFLLLP KDCI DO VMKLONLFVDDSGR Y LA I Q F 
! ILE WAYVFLYY YEY R KAKDQLDI AKDI SQL03 DLTGALG KRTRF 
0ENYVAQLILDVRREGDVLSNCEFTPAPTPOBHLTKNLELNDDT 
ILND1 KLADCEQf- QMPDLCAEEIA1 1LGICTNFQKNNPVHTLTE 
VELI^FrSCLLSOPKF^AlOl'SALILRTKLEKGSTRRVERAMRO 
TCALADQFEDKITSVLERLKIFYCCOVPPHWAIQROLASLLFEL 
GCTSSALOIFEKLEMWE 


5751


3 


751 


SCGSALRAWRCGAAALiATFPAPALPGLMYRALYAFRSAEPNALA 
FAAGETFLVLER f S AHWWIAARARSGETGYVPPAYL,RRLQGLEQ 
DVLOAIDRAIEAVHNTAMRDGGKYSLEORGVLQKLIHHRKET^S 
RRG P SAS SVAV M" SSTSDKK LD AAAARQPNGVCRAG FERQHSI.P 
SSEHLGADGGLFOI PLPSSCI PPQPRRAAPTTPPPPVKRKDREA 
LMASGSGGHNTKPSGGNSVSSGSSVSSCI 


5752 


1 3 


471 


GP V CG VG LS VA W AG P WRG P VH S VGGGG RAALHG AE L PCLS G AAT 
VEREMELRHKNE^LRVETEARARAKAERENADIIREQIRLKASE 
HROTVLESI RTAGTLFGEGFRAFVTDRDKVTATVNl FIKCGWQV 
AERCWGASWSPRSCPCRLCTAL 


5753 


34 


4E3 


DDSXA1 PGGVQAI FGAVRNI YTPRTGHRI RKJUDQ1QSGGNYVAG 
GQEAFKKLNY1.D? GEI KKRPKEWNTEVKPVIHSRINVEARFRK 
PLOE PCT I FLIAAGDLJ NPASRLLI PR KTLNQWDH VLQMVTEK I 
TLRSGAVHRLYTLEGRLV 


5754


14 


i-aa 


TLVH WE FAGEHASA I ASREQ EVLQGW KELl/SACEDARLHVSST 
ADALR FJIS Q VRDLLS WKiDG 1 A SOI GAADKPRCPS SLLGLPAS PW 
WPTPATPSPLTA F FS ME 


5755 


3 




LGDQFY KSA1 EKCKS YWSRLCAERSVRLPFLDSQTGVAQKNCY I 
WMEKRHRGPGLAVGOLYTYPARCWRKKRRLHPPEDPKLRbLEIK 
PEVELPLKKDGFTSESTTLEALJjRGEGVEKKVDAREEESIQEIQ 
RVLENDENVEEGNEEEDLEEDIPKRKNRTRGRARGSAGGRRRHD 
AAS0EDHDKPYVCP1CGKRYKLNRPGLSYHYAHTHLASEEGDEAQ 
DQETR SPPNHRNEKHR PC\KGPDGTVT PNNYCDPCLGGShfMNKKS 
GRPEELVSCADCGRSAHLGGEGRKEKEAAA 


5756 


3 


t21 


SS KLQALF AH PLY N V P F. E PP L »LG AEDSLlASQE ALft Y Y R R KV AR 
WNR RH KM Y R EQMK LTS LD PPLQ LRLEA S WVQFHLG I NRKG LYS R 
SSPWSKLLQDMRKFPT1SADYSQDEKALLGACDCTQIVKPSGV 
HLKLVLRFSDFGKAMFKPMR00RD2ETPVDFFYFIDFQRHNAEI 
AAFHLDR I LDFRR V PPTVGR 1 VNVTKEI L 


5757 


3 


4"/3 


YKDALLLPDNHRCWFENGTLKLTDVQKGMDEGEYLCSVLIOPQ 
LS I SQSVHVAVKV F PL I OPFE FPPAS IGQLLY I PCWS SGDMPI 
R I TWRJ0X5QVI I SG SGVTI ES KE FMSSLQ1 SSVSLKHNGNYTCI 
ASNAAATVSRERO^I VRVPPR FW 


5758


1 


414 


FRRGAGAERGEHRFGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNOSNRLAVSRTDGTVEIYKLSAKYFOEKFFPGHESRATZALC 
WAEGQRLFSAGLKGE I MEYDLQALNI KYAMDAFGGPI WSI4AASP 



WO 01/53312 PCT/US00/34263









SGSQUA'GCEDGSVKbrQlTPDKIPV 


5759


2 



1240 


GNAAFAGUGVVYETrKMSDLPSYTTNGTVHVVVNNC 1 GFTTDPR 
MAR SS F Y PTDVAR VY74AP 1 FHVNADD P EAV I YVC£ VAA EWRNTF 
NKDVGADLVCYRRRGHKEMDEPMFTQPLMYKQIHRCVPVLKKYA 
DKLIAEGTVTLQEFEtEI AKYDRI CEEAYGRSKDKK1 LHI KHWb 
DS PWPGFFWVDGEP K S MTCPATGI PEDKLTH 3 GSVA S S V ?LE DF 
KiHTGLSRILRGRADMTKNRTWWALAEYMAFGSLLK-FGlHVRL 
NGODVERGTFSHRKHVLHDQEVDRRTCVPMNHLWFPOAPYTVCN 
S SLSEYG VW5FELGY AJ4ASPNALVLWEA0FGDPJINTA0C 1 1 DQF 
J S TGQA K W VRHNG I VLLLPHGMEGMGPEHSSARPEK ¥ LQMSNDE 
SDAYPAFTKDFEVSQL 


5760


1 


1221


vrditsdslslgw'ivpegofdkflvqfk^gdgopkavrvpghed 
gvti sglepdhkykknlygfhggqrvgpvsavgltap gkdeema 
fastepptpeppikfrleeltvtdatpdslslswtvfegofdhf 
lvqykngdgopkatrvpghkdrvt: sglepdnkykknlygfhgg 
crvgpvsa3gvtaaf.eetptptepsmeapeppeepu.geltvtg 
55s pdsls1 >swtvpqgr fds ftvqykdrdgrpqwrvggeesev7 
vggbepgrkykmhlygl.hegrrvgpvstvgvtapoeev'detpsf 
tepgteapeppeepllgeltvtgsspdslslswtvpogrfdsft 

VOYKTJRDGRPQAVRVGGQESKVTVRGbEPGRKYXMHLYGLHEGR 
RLGPVSA3GVT 


5761 


3 


1275 


S C DMAE AAALVW 1 RG I- G FG CKA VRCAS GR CTVRDF I HK H CQDON 
VPVENFFVKCNGALI NTSDTVQHGAVYSLEPRLCGG KGGFGSML 
RALG AO I E KTTNR FACRDLSGRRLRDVNHEKAMAE W VXQQAERE 
AEKEQKRLERLORKL.VFPKHCFTSPDYCOOCHEMAERLEDSVIjK 
GMQAAS S KMVSA E 1 S ENR KRQWPTKS QTDRGAS AG K H R CFWLGM 
EGLETAEGSNSESSDDDSEEAPST5GMGFHAFKIGSNGVEMAAK 
FPSGSORARWNTDHGSPEOUJIPVTDSGRHlLEDSOvELGESK 
Ef JKES RMVTETEETQF K KAES KEP 1 EEE PTGAGLN KDKETEER'J ' 
DGERVAEVAPEERE?^VAVAKi0ES0PGNAVIDKETIDLLAF7SV 
AELELLGLEKliKCEl^TAliGLKCGGTLQ 


5762 


2 


344 


GSTGOTPLHSOGGGGGSGGGRRRTPRGMPKEKYEPrDPRRMYTI 
KSSEEAANGKKSHWAELE3SGKVRSLSASLWSLTHLTALHLSDN 
SLSR J PSDI AKLHNLVYLDLSSNKIR 


5763


3 


429


LDKJDTGL1 HL1ARLDYEL1QRFTLT3 3 ARDGGGEETl GRVR3NV 
LDVNDNV p T FCKD AY V G ALR ENE PSVTQWRbRAT DE 1- S PPNNO 
1TYSI VSASAFGS YFD J SLYEGYGV 3 SVSRPLDYEQ 3 5-NGL3 Y L 
TVMAMDAGN 


5764


19 


441 


VCARACGEMRQLLRPI DRQRYDENEDLSDVEEIVSVRG F SLEEK 
LRS0LY0GDFVHAMEGKJ5FNYEYV0REALRVPLI FREKDGLGI X 
MPDPDFTVRDVKLLVG5RRLVDVMDVNT0KGTSMSMSCFVRYYE 
TPEAQRDXL 


5765 


3 


82 S 


OK 1 LRLNNS HOPPTS S SMSKDCGGPASSGAGATAAIjADGLKFAS 
V0ASAP0GNSHKETSKSKVKRSKTSKDANKSLPSAALYGIPE1S 
S7G KRQEVOGR PGEATG MNSALGQS VSSGGSGNPNSNF TSTSTS 
AATAGAGS CG KSKEEK PG KSOSS RGAKRDKDAGKS RKD KHDLLQ 
GHONGSGSOAPSGGHLYGFGAXSNGGGASPFHCGGTGSGSVAAA 
GEVSKSAPDSGLMGNSKLVKKEEEESESHRRIKKLKTEKVDPLF 
TVPAPPFHV


5766


1608 


663 


SGL.FSVDP AS SOAMEbS D VTL I EGVGNEVM WAGWVL 3 LALVL 
AWLSTYVADSGSNOLLGAIVSAGDTSVLHLGHVDKLVAGQGNPE 
PTELPHPSEGNDEKAEEAGEGRGDSTGEAGAGGGVEFSLEHLU? 
1QG LPKRQAG AGS S£ P E AP LRS EDSTCLP PS PGLI TVK L-X FLND 
TEE1AVARPEDTVGALKSKYP*PGQESCMKLIYOGRL3.-COPARTL 
RSLN1 TDKCV I HCHRS r PGSAVPGPS ASLAPSATEPPS hC VNVG 
Sm VPVFWLLGWWY FR 3 KYRQFFTAPATVSLVGVTV' FFSFLV 
FGMYGR 


5767 


2 


8?2 


NFRATPRPPTRP*LRTGTEVILWYU)WRALMKRKRMXAJ«lktVG 
SGFPLPSSDLDDSLTEEIDEKIGFRJIDANFDWONVADFKDAGGS 
LTEVKVEEEERDPQSPEFEIEEEEEWLSSVlPDSRREKEbPDFP 








HIDEFFTIASTPSRSAYDEPHLLVNIEKQKLELL'KRRLDI EAER 
LOVE KERLO I EKERLRHLDMEHER1QLEKERLQI ERE KLR LQ I V 
NSrKPSLENELGOGEKSMLQPODlETEKLKLEHERLQLEKDRLQ 
ELK r ESEK LQ I EKERLQVEKDRLR I CKEGRLQ 


5768




476 


S SK £ R LS VS V S P P F PG I VELG P P FAWEFCS RLGS AVTSQR AG PA 
AAK V AKD Y P F Y LTVKRANrSLELP ? ASG PAKB AEEPSN KR VKP u 
SK VTSLANLI PPVKATPLKRFSQTLQRS1 SFRSESRPD1 LAPRP 
WSRK AAPS £ 7 K R RDS KLWS ET FDVC 


5769




667 


TK? K KG VK E iLATDCS VKAFAEi ICPELQY VGFMG CS VTS KG V I HL 
TKLKNLSSLDLRK3TELDNETAMEI 7KRCKNLI SLNLCLNVJIIN 
DRCVEV1AKEGQNLKELYLVSCKITDYAL2AIGRYSMTJETVDV 
GVJCKEI TDGG ATL1AQSS KSLRYLGLMR CD K VN EVTV E 01 > VQQY 
ph: TFSTVLC-DCKRTLERAYQMGWT?NMSAAS£ 


5770 




484 


PSKRYDVKTRKWSFLLEEHSKLIAKVRCLPQVQLDPLP7TLTIjA 
FASQLKKT5LSLTPDVPEADLSEVDPKLVSNLMPF0RAGVNFAI 
AKGG R LLLADDMG bGKTl 0A1 C 1 AAFYRKEW P LLVW P £ S VRFT 
W E OA FL R WL P S LS PDC I NWVTG KD R LTA 


5771 


If i 


741 


GLLP£ACLR;^SWREASEGPSSRACSNGSOUTFEAeYSGTSTPS 
FHGS HCSGS DHSSLGLEQLQDYMVTLRSKLGPLE I QQFAMLLRE 
YRbGLPlQDVCTGLLXLYGDRRKFLLLGMRPFI PDQD1GYFEGF 
LEG VG I RECG r LTDSFGRI KRSMSSTSASAVRSYDCAACRPEAQ 
AFHRLLADITHME 


5772


34* 


383 


EFNI*ALVCP£ HPQIKARDDQPLPGVLLSLSGGLFRSNLLTODNG 
ILTFSNLVTCSAIYHLPVFPEREPGCSMRDLRVA 


5773




723 


KIPLSOEEITI.OGHAFEARIY^EDPSWFKPVAGPLVHLSTPRA 
DFSTR J ETTATROGDEVSVHYDPM1AKLVVWAADRQAALTKI.RYS 
LRQYN 1 VGI .KTNI DFLLNLSGHPEFEAGltfVHTDFI PQKHKCLLL 
S R KAAAKES 1. COA ALGL1 L X E KAMTDT FTLQ AH DQFSPFSSSSG 
RRLN 1SYTRNMTLKDGKNSK 


5774 


* 


592 


FVEFENIRVVkCGGSELNFRRAVFSADSKYI FCVSGDFVKVYST 
VT EE Oil I LHGHRNLVTG IQLNPNNHLQLYSCS LDGTI KLWDYI 
DGI lO KTF1 VGCK LHALFTLAQAEDSVFVI VKKEKPDI FOLVSV 
KL^KS.SSQRVEAKELSFVLDYINQSPKCIAFGNEGVYVAAVREF 
YLSVYFFKKETTSRVTLSSS 


5775 




538 


SSGCCDPAAP^SUvEAATMPVSKCPKKSESLWKGWDRKACRNGL 
RSCVY-AVNGDYYVGEWKDNVKHGKGTQVWKKKGAIYEGDWKFGK 
RDGYGTLSLPDQQTGKCRRVTSGWWKGDKKSGYGZOFFGPKEYY 
EGrWCGSQR SG WGRHYYSNGDIYEGOWENDK PNGEGMLRLSQNP 
RP 


5776 


> 


484 


R LPGDCVCONLS ESLGTLCPS KGLLFVPPDlDRRTVELRiGGNF 
1 1 H } S RQDFANMTG LVDLTLS RNTI SH I QP FS FLDLESLRS LHL 
DSWLPSLGEDTLRGLVNLOHLIVNKNOLGGJADEAFEDFLLTL 
EDLDL £ YNN LHG P AVGLRGDAW VQPS TS 


5777 


1 


949 


GODPEPG0DLFOPEREVDPSWGRGREPRLGKLRF0NDHLSVLKQ 
VXKLEOALKDGSAGLDPQLPGTCYSPHCPPDKAEAGSTLPENLG 
GG5GSEVSORVHPSDLEGREPTPELVEDRKGSCRRPWDRSLENV 
YRGSEGSPTKPF1NPLPKPRRTFKHAGEGDKDGKPGIGFRKEKR 
NLPPLPSLPPPPLPSSPPPSSVNRRLWTGRQKSSADHRKSYEFE 
DLLOSS S ES S R VDWYAQTKLGLTRTLS EEMVYED I LDPPMKENP 
YEDI ELHGRCIX3KKCVLNFPASPTSSI PPTLTKOSLSKPAFFRO 
NSERRNV 


5778 




1210 


QRROSVSRLLLPVFLLEPPAEPGLEPFPEEEGGEPAGVAiEPGS 
GGPCWLQLEEVPGPGPLGGGGPLRSPSSYSSDELSPGEPLTSPP 
WAFLGA PER PERLLNRVLER LAGGATRDS AAS DI LLDDI VLTKS 
LFLPTEKFLOELHQYFVRAGGMEGPEGLGRKQACLAMLLHFLDT 
YOGLLCEEEGAGHI 2 KDLYLLIMKDESLYQGLREDTLRLHOLVE 

tvelkifeenoppskqvkplfrhfrridsclotrvafrgsdeif 
crvywfdhsyvt1rsrlsasv0dilgsvteklqyseepagrfds 
l: lvavsssge kvllqptedcvftalgi nshlfactrdsyealv 








PLPEElQVSPGDTEiHRVEPEDVANHLTAFHWELFRCVHSLEFV 
DYVFHGE 


5779 


138 


1571 


EAV0VL3 KH S ADVN AR DKN WQTP LHVAAAK KA VK CAE V 1 1 PLLS 
S VNVSDRGGR TALHHAALNGHVEM\'NLLLAKGAN1 NA FD KKDRR 
ALHWAAmJHLDWALLlNKGAEVTCKDKKGYTPLKAAASNGQI 
NVVKHLbMLGVEIOElNVYGNTALHIACTNGODANAmELIDyGA 
K\^OPNNNGFTPLHFAAASTHGALCLEXiLVT^GADVTJ 1 QSKDGK 
S PLKMTAVHGRFTRSQTLI QNGGE I DCVDKDGNTPLH VAARYGH 
ELLlNTL>ITSGADT7-iKCGIHSMFPLHLAALNAHSDCCRKLliSSG 
OKYS 1 VS LFSNEHVX S AGFEI DT PDKFGRTCLHAAAAGGNVECI 
KbLOSSGADFHKKDKCGRTPLHYAAANCHFHCIETLVrrGANVN 
ETDDWGRTALH YAAA5 DMDRNKT 1 LGNAF. DNS EELERAR ELKEK 
EATLC I j Z FL LQN DAN PS I RDKEGYNS I H YAAAYGHROCLELLLE 
RTNSGFKESDSGATKSPLHLAVSEMP 


5780


154 


624 


OFFR V I TCLPFKGPD YRLY KSEPE LTTVAEV DE SNGE EXSE P VS 
ElETSWKGSHFPVGWPPRAKSPTPESSTlASYVTLRKTKKMK 
DLRTEUPRSAVEQLCLAESTRPRMTVEECMERIRRHCOACLREK 
KKGLNV1GASDQSPLQSPSNLRDNP 


5781 


IS 


541 


RGSLGGHPWRFPMRAASOGCL.PVSFVTGPHOERAYGGRGPGGAF 
PAPPVSGTCPPDLI YAPTPEKAEGGSOKNHQPP PGERAAHRDGE 
OAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 
VQFHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 
0HS1 HTVTCKSPRQKEDRSPKPPOAPKHPEEHGRQS\0APPPL.P 
VArSRTCGGC*TWDPALLVS?/P0ODSTPELPAP\0QPTGGPSR 
CROALP PQG * ROQPxORPR / PTGASRSH PAKAKGCOGP FK I RtfY 
NIMC 


5782 


5176 


1237 


DRSKMSMAADSYTDSYTDTYTEAYMVPPLPPEEPPTMPPLPPEE 
PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 
S0PEPPVSOSE1SEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 
PEPESSITI/TPVESAWAEEHEWPERPVTCtfVSETPAMSAEPT 
VLASE P PVMSETAETFDSMRASGHVASEVSTSLLVPAVTTPVLA 
ESILEPPAMAAPESSA'^IAVLESSAVTVLESSIVTVLESSTVTVL 
E?SWTVPEPPWAEPDYVTIPVPWSALEPSVPVLEPAVSV],Q 
PSMI VS EPS VS VQES TVTVSEPAVTVS EOTOVI PTEVAI ESTPM 
3 LESS I NSSHW5KG J NLSSGDONLAPEIGMQEl ALHSGEEPHAE 
EHLKGDFYESEHGIRIDI^INNHLIAKEMEKNTVCAAGTSPVGE 

1geek1lptsetkqrtvldtypgvseadagetlsstgpfalepd 
atg\tskgib:f^astlslvnkydvdi>slttqdtehdmlistsp 

SGGSEADZEGPLPAKDIHLDLPSNINLVSSD7NEPLPVKRD\DQ 
TLAALl\SLiXESSGGEKEVPPPS*REHLPDSGFSANIEDJNEAD 
LVR PVSS PRTWNVLPS PRAGlA EGP \LLASDFGPVQNLY SS PW 
\ SSMP\ ERASGS\SSGEKGG\ YEI FVKVKDTKEKSKKNXNRDKG 
EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 
HRSNCTJRSRSRS/RDRRRRSSRSRSKSRGRRSVSKEKRKRSPKH 
RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSKTPSRRRRSR 
SVGFRRSFSISPSRRSRrPSRRSRTPSRRSRTPSRRSRTPSRRS 
KTPSRRSRTPSRRRRSRSWRRRSFSISPVRLRRSRTPLRRRFS 
RSPIRRKRSRSSERGRSPKRLTDLDKAQLLEIAKANAAAMCAKA 
GVPLPPNLJtPAPPPTI EEKVAKKSGGATI EELTEKCKQ I AQSKE 
DDDVI VNKPHVSDEEEEEPPFYHHP FKI.S EPXPI FFNLN I AAAK 
PTP PK S QVTLTKEFP V SSGSOH RKK EADS V YGSWV PV EXNGEEN 
KDDDWFSSNLPSEPVDISTAMSERAIJ\QKRbSENAFDLEAMSM 
LNRA0ER1DAWAQUCS I PGQFTGSTGV0VLTQEQLANTGAQAW1 

kkdqflraapvtggmgavlmr)^igwregeglg;<nkegnxepilv 

DFKTDR KGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 
KRRW0PPEPLLVHDSGPDHRKHFLFRVL3NGSAYOPNCMFFLNR 
Y 


5783 


1693 


69b 


■ DS6LRVAFT>JEGiSNFKTPSKLSEKKXSVLCSTPTINI PASPFM 
0KU5FGTGVNVYLMKhSPRGLSKSPWAVKK1HPICNDHYRSVYQ 
KRL^EAKILKSLKHPNIVGYRAFTEAMTCSLCLAMEYGGEKSL 
NDL1 EE / PI * SQ / PK1 LFQQP/L I LKVALNMARGLKYXHQEKKL 








LHGDlKiSNWlKGDFETIKICDVGVSLPLDEKMTVTDPEACYI 
GTEPViKPKEAVHEMGVlTDKADlFAFGLTLWEMMTLSIPHINLS 
NDDDDEi; KTFDES D FDDEA Y YAAU3TRPP INMEELDES YQKVI E 
LF S V CT H EDP KDR PS AAH 1 V E ALSTDY 


5784 


2665


1388 


PRVRPKVRTDHNYYlSRlxGPSDSASRDLHVNlDQNEKDKVKIH 
Gl LSNTHRQAARVN LSFDFPF Y GH FLRE 1 TVATGGF I YTGEWH 
KMLTATO Y I APLMAN FDPS VS RNS TVR Y FDNGTALWQWDHVH L 
QDNYNLGS FTFQATLLMDGRl 1 FGYKEI PVLVTQ2 SSTNHPVKV 
GLSDAF\'\'VHRI0O3PNVRRRriYEYliRVEL0MSKITNISAVEM 
TPLPTCLQFNRCGPCVSSOIGFNCSWCSKLO^CSSGFDRHRQDW 

vdsgcf =;eskekmcentepvet\fleppop* sroppssgs*lpp 

E/DAVTbQFPTSLPTEDDTKlALHbKDNGASTDDSAAEKKGGTL 
HAGL1 VG J LILVL1 VATAILVTVYMYHHPTSAAS JFFI EPRPSR 
WPAMKFRRGSGHPAYAEVEPVGEX.EGF J VSEQC 




5785


2669


1389 


PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDOMEKDKVK1H 
GI LSNTERQAARVN LS FDFPF YGH FLR E 1 TVATGGF I YTGEWH 
RMLTATCY I APLMANFDPS VSRNSTVR Y FDNGTALWQWDHVHL 
QDN Y S I XT- S FTFQAT LLMDGR 1 1 FGYKEI PVLVTQISSTNHPVKV 
GLSDAFWVlIRIQOlPrnTRRRTIYEYHRVELQMSKITNISAVEM 
TPLPTCL0fNRO3PC\'SSQIGFNCSWCSKL0HCSSGFDRHRODW 
; VDSGCPFESKEKMCENTEPVET\FLEPPQP*ERQPPSSGS*LPP 
E/nAVTSOFPTSLFTEDDTKIALHLJCDNGASTDDSAAEKKGGTL 
KAGLI VG3 LILVLT VATAI LVTV YMYHHPTS AAS I FK1 ERRPSR 
W P AMK FR RGSGHPA YAEVE P VG E K EC F I VS EQC 


5786 


2532 


1674 


SYKLPAJERRASSCSQPPTPTRRRW?APGRTSKGHRPQM*SGTP 
APRPPAR£TVSPASPl,PKPRAGRCGSRPRSACSTFRPC*SLiN*M 
S* H* KRNI,SQRSS5KSRRPLSCARPHR* * RQGLTVAARLPTWAK 
SPPLACSFC0AA0KSOSLSSGRSTR* PERMS FR?\SPPGNPAIP 
SLAPSSRP/PKGRPOCTWIPSRWPASPTAPPTTT*APTSSPGST 
GRSMMTCFTRWTATPWSARASSRPRNWPTP* WR PSGRLSTV* RA 
TGGSTATAPPKRFPRKWNPMMAE 


5787 


7 


3460 


MAS AAS VTS LADEVNCP \ I COGTLK EAGSLSNCG /HKKFCRACL 
T\RYCE1P\GPD\LEESP\TCP\LCKEPFRP\GSFRPNW0LANV 
VEN1 ER LOIWSTliGIXEEDVCOEHGEKI YFFCEDDEMOLCWCR 
EAGEHATHTMRFIiEDAA\APYREQIHKCLKCbIXEREE10EI0S 
RENKRMOVbLTQVSTKKQQVISEFAHLRKFLEEOOSILLAOLES 
0DGD1 LRCRDEFDLLVAGEI CRFSALI EELEEKNERPARBLLTD 
IRSTLI RCETRKCRKPVAVSPELGQRI RDFPOOALPLQRBMKMF 
LEKLCFE LD YE PAH I SLDPQTS H PKLLLS EDKQRAQFS Y KWONS 
PDNPQRFDRATCVLAHTG1 TGGRKTWWS IDLAHGGSCTVGWS 
HDVQRKGELRLRPEEGVWAVRLAKGFVSALGSFP\TRLTLKEOP 
R0VRVSL.DYEVGWVTFTNAVTREP1 YTr TASFTRKVI PFFGLWG 
RGSSFSLSS 


5788


2 


6860


EHS VSG R S S A YGDATAEGH PAG PGS VS S STGAI S TTTGHQEGDG 
SEGEGEGZTEGDWTSNRLHMVRLMLLERLLQTLPQLRNVGGVR 
AI P YMQV I LMLTTDLDGEDEKDKGALDNLLSQL J AELGMDKKDV 
SKKNERSAI^E\^LV-VMRLLSVFMSRTKSGSKSSICESSSLISS 
ATAAAIibS SGAVDYCLHVLKS LLEYKKSQONDEEPVATSQLLKP 
HTTSS P PDMSPFFLRQ YVKGHAADVFEA YTQLLTEMVLRLP YQ I 
KKITDTNSRIPPPVFEHSWFYFLSEYLMIOOTPFVRROVRKLLL 
FI CGS KE K YRQLRDL.HTLDS \HVRG I KKLLEEQG 1 FliRASVVTA 
SP0SALOYDTL1SLMEHLKACAEIAAQRTINW0KFCIKDDSVLY 
FLUJVSFLVDEGVSPVLI^LLSCAIXTGSKVLRAIAASSGSSSAS 
SSPAPVAA.SSGQATT0SKSSTKKSKKEEK5KEKDGETSGSQBDQ 
LCrALVNOLNKFADKETLIOFLRCFLLESNSSSVRWOAHCLTLH 
I YRNS S KS QQELLLD LM WS 1 W PE LPAYG R KAAQFVDIj1*GYFS LK 
TPQTEKK LKSYSQKA VE ILRTQNHILTNUPNSN 3 YNTLSGLVEF 
DGY YLES DPCLVCKNPEVPPCY I KLSS I KVDTR YTTTOQWKLI 
GSHTI SKVTVKI GDLKKTKMVRTI NLY YJWRTVQATVELKNKPA 
R WHKAKKVQL.TPGQTE VKIDLPLP I VAS NLMI EFADFrTNYQAS 
TETLOCPRCSASVPANPGVCGNCGEimoCHKCRSllTYDEKDPF 











LCNACGFCKYARFDFK^YAKPCCAVDPIENEEDRKKAVSNINTL 
LDKADRVYHQLMGHR P01.ENLLCKVNEAAPEKPQDDSGTAGG1 S 
STSASVWRYILCLAOEVCGDCKXSFDELSKHQKVFASRKELLE 
YDLQCREAATKSSRTSVOPTFTASQYKALSVLGCGHTSSTKCYG 
CASAVTEHCI TLLRAl^TNPALRHILVSQGLI RELFDYULRRGA 
AAMREEVROLMCLLTRDNPEATOOMNDlillGKVSTALKGHWANP 
DLASSLQYEMLLLTDS ] S KEDS CWELRLRCALSLFLMAVNI KTP 
VWENI TU1CLRILQICI KPPAPTSKKKKDVPVEALTTVJCPYCN 
E I HAQAOLW LXRDPKAS Y DAWK KCLP I RG I DGNGKAPS KSELRH 
LYLTEKYVWRWKOFLSKRGKRTSPLDLKLGHNIWLROVLFTPAT 
0AAR0AAC7I VEAIAT 1 PSRKOOVbDLLTSYLDELSlAGECAAE 
YLAbYOKLZTSAHWKVYLAARGVLPYVGNLlTKElARLLALEEA 
TLSTDLQOGYAbKSLTGLLSSFVEVESIKRHFKSRLVGTVLNGY 
LCLRKLW0RTKLIDETQDMLLEMLEDMTTGTESETKAFI4AVCI 
BTAKRYNLDDYRTPVF1 FERLCSI I YPEENEVTEFFVTLEKDPQ 
QEDr LQGRMPGNPYSSNEPG1GPLMRDI KNKI CQDCDLVALLED 
DSGMEbLVNNKIISLDLPVAEVYKKVWCTTNEGEPMRlVYRMRG 
LLGDATEEFJ ESLDSTTDEEEDEEEVYKMAGVMAQCGGLECMLN 
RIJVGIRDFKQGRHLLTVLLKLFSYCVKVKVNROQLVKLEMNTLN 
VMLGTLNIALVAEQES KDSGGAAVAEQVLS 1 MEI \ ICAEPNVEP 
LSEDKGNLLLTGDKD0LV1-1LLD03NSTFVRSNPSVLCGLLRI1P j 
YLSFGEVEKMOILVERKKPYCNFDXYDEDHSGDDKVFL\DCFCK 
I AAG I K\ NN SNGHQIA KDL\I LOKG I T0NALD\ YMKKH 1 P/SAA j 
R 1 WDADI \ WKSFCLRPALP F I LRLLRG LAI QH PGTQVL I GTDS 1 
PNLHKLEQVS\SDEG3GTLA\ENL\LESLREHPDVNKKIDA\AR 
RETRA£KKi<MAJ*tAMROKJiLGTIX5\MTTNEKG0WD/l^TALLEA 
DWEEL.I EEP\Gt,TCCI CREGYKFQPTXVLG1 YTFTKRWLGGVW 
EN K P R E TS RATS TVSH FN I V: I YDC \HLA \ AVS LARGRE E WES AA 
LQN AN T KCNGLLP VWG PKVPE S AFATCLARHN T Y LQECTGCR E P 
TYQLN I HD1 KLLFLRFA-^EOS FSADTGGGGRESN IHLI P Y 1 1 KT 
GLYVLNTTRATSREEKNLOGFLEOPKEKWVESAFEVDGPYYFTV 
IjALHI LPPEQWRATRVEi LRRLLV7S0ARAVAPGGATRLTDKAV 
KDYSAYRSSIj1>FWALVDL3 YNMKKKVPT5NTEGGWSCSLAEYIR 
HNDMPIYEAAt>KALKTF0EEFMPVETFSEFLDVAGLL5EITDPE 
SFLKDLLNSVP 


5789 


1 


2407 


lplhavektgrpgqpal^pgklrsdaglesdtamkkgetlrko 
teekekkskpksdktee ) aeeeetvfpkakqvkkkaepsevdmn 
s pks kkakk \ ke e psqnft i s p xtks lrk kke pie kkws sktkk 
vtkneepseeeidapkpkkmxxekemngetrekspklkngfphp 
epdcnpseaaseesnse3eqeipveqkeg\afsnfpiseetikl 
lkgrgvtflfp1 qaktfhhvysgkdli aqartgtgxtr sfai pl 
ieklhgNei^drkrgrafovlvlaptrelanovskdfsditkxl 
svacfyggtpyggqferkrngi dilvg7pgrikdhi qngkldlt 
xijwwlpevdoml0mgfax)cveeilsvaykkdsednpotli>fs 
atcphwvfnvakkymks tyeovdli gxktoktai tvehlai xch 
wtqraavigdvirvysgkqgrti i fcetkkeaqelsqnsaikqd 
aqslkg d i pqkor ei tl» k gfrngs fgvlvatnvaargldi pevd 
lviqss ppkdve syihr figrtg ragrtgvci cfyqhkeeyqlvq 
veqkag i kfkrigvpsatei ikasskdai rlldsvpptaishfk 

OSAEKL I EE KGAVEA1AAALAH I SGATSVDQRSLINSNVGFVTM 
ILQCSI EMPNI S YAWXELKEQLG E E I DS K VXGMVF LKG KLGVC F 
DVPTASVTE 1 QEKWHD5 R RWQLS VATEQP ELEGPREGYGG FRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGORSGGGNKSNRSONK 
GQKRS FS KAFGQ 


5790 


3786 




ARRQRDPLQALRRRNQELKQOVDSLLSESOLXJEALEPNKRQHIY 
QRCIQLKGAI DENKNALQKLS KADESAPVANYNQRKEEEHTLU) 
KLT0DL0GLAVTISREN1TEVGAPTEEEEESESEDSEDSGGEEE 
DAEEEEEEKEENESHKW S TGEEY I AVGDFT AQQVGDLTFKKGEI 
LLV I E KKPDG WW I AXDAXGNEGLVPRTYL E P YSEEE EGQES SEE 
GS EBDVE AVDETADGAEVKNOR TDFH WS AVQXAI SEAGI FCLVN 
HVSFCYLlVWJRNRMETVEin^GSETGFRAWNVOSRGRIFLVSK 








PVLv ) OlNTVDVLTi'MGAIPAGFKPSTLS0LLEEGN0FRAKYFL0 
PELViPSCJLiAh'WnLMWDATEGTI RSRPSRISLIbTLWSCKMI PLP 
GMSlQVLSRHVRLCLFDGKKVliSNIHTVRATWCPKKPXTWTFSP 
QVTR I LPCLLDGDCFI RSNSAS PDbG I LFELC- 1 SYIRNSTGERG 
ELSCGKVFLK LFDASG VP1 PAKTYELFLNGGTPYEKG I EVDPS I 
SR RAHGS VF Y 01 WTMRROPO LLVKLRS bNRRS RNVLS LLPETL I 
GNMCSIHLLI FYR0ILGDVLLKDRMSliQSTDL2SHPMIiATFPMb 
LE0PDVMDAliRS5WAG0ES\TLKKSEKR\PK^:FLKVPRrLLVYH 
XGCVLPLL/HTPTRl.PPFRWAEEETETARWKVITDFLKQNQENQ 
GA LQAL LS PDG VHE P FDLS EOT Y D F LG EMRKNAV 






5791


163E


LRVAEFAGTSR/3GAGLI0PLHRAPARDHGL1.RGGAAPALSVSH 
GN/GKQL/W4ES0GSDDEQIKRENIRSLTMSGHVGFESLPD01*V 
NR S I QQGFCFN I LCVGETGI G KSTLI DTLFNTNFED YESSHFCP 
KVKLKAQTYE LOES NVOLKLT I VKTVG FGDO 1 NKEES Y 0 ? I VDY 
1 D AO FEA Y L0 EELK 1 KR S LFT Y H D S R I KVC LY F I S PTGH S LKTI* 
DLLTMKNLDSXVYJ 1 PVIAKADTVSKTELQKFXIKLMSELVSNG 
VQ I YOFPTDDDTI AKVT^AAMNGOLPFAWGSKDEVKVGN XMVKA 
ROVPWGWQVEMEKHCDFVKLREMLI CTNMED LREQTHTRHYEL 
YR R CKLEEMG Fl'DVG P EN KP V£ VO E T YE AK R H E FHGERQR K EE E 
MKQMr VQRVKE <EA J LKEAER ELQAXFEHLKRbHQEERMKbEEK 
RRLLEEEI1AFSKKKATSEIFHSQSFLATGSNLRKDKDRKNSQF 
FVKOKVPEHRKSSSQANFIKKKLEVCFDFAVICFITSIFGEQPO 
LLI FMEKYFQVOG0YISQSE 


5792 


2263 


651- 


AAAAPSPAWWCGVFWYWHTCWVMYGIVYTRPCSGDASCIQPY 
LARRPKLQL\RHS FTTTRSirLGAENN I DLVLNVEDFDVESKFER 
TVNVSVPKKTRNNGTLYAYIFLHHAGVLPWHDGKOVHLVSPLTT 
YMV PKP EE INLLTG E S DTQQI EADKKPTSALDE ? VSH WR PR LAIj 
rA^lADNFVFDC;sSLPADVHRYWKMl OLGKTVHYLPILFI DQLSN 
RVKDLMVINRSTTELPLTVSYCKVSLGRLRFKIHMOEAVYSLOQ 
PGFSEKDADEVKG I FVDT*NIiYFLAtiTFFVAAFHLLFDFI>AKKMD 
I S FWKKKKSM 1GMST KAV LWRC FSTWI FLFLLDEQTS LLVL VP 
AGVGAA I ELWKV K KALKMT1 FWRGLMPEFQFGTYSESERKTEEY 
DTQA^KYl.SYLLyPLC^'GGAVYSLl^IKYKSKYSWLINSFVI^GV 
YAFGFLFMLPOI^FWYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
HTNPTSHRlACFRDDWFLVYlYORWbYPVDKRRVNEFGESYE 
EKATRAPHTD 


5793 


2263 


653 


AAAAPS PAWWCGVr WYVVH TCW VMYG I VYTR P CSGDASC1 QP Y 
LARRPKLQ1AR HS FTTTR S H LG AE1W 1 DLVLNVEDFDVES KFER 
TVNVS V p KKTRNWGTbY A Y 1 FLKHAGVbPWHDGKOVHLVSPLTT 
YHVPKPEEINbLTGESDTQOIEADKKPrSALDEPVSHWRPRLAL 
NVMADNFVFDGSSLPAE-VHRYMKNl QLGKTVHYLPILFIDQbSN 
RVKDLM VINRST?ELPbTVS YDKVS bGRLRFWI HMQDAVYSLOQ 
FGFSEKDADE VKG I FVDTNLY FLALTFFVAAFKbbFDFLAFKND 
I SFWKKKKSM3 GKSTKAVLWRCFSTW1 FbFLbDEOTSLbVLVP 
AGVGAAIELWKVKKALKMT2FKRGLMPEFQFGTYSESERKTEEY 
D7X?AMKYbSYbLYPbCVGGAVYSbLNIKYKSWy5WLIWSFVNGV 
YAFGPbFMLPOLFVNYKLKSVAHbPWKAFTYKAFNTFIDDVFAF 
1 1 TMPTSHRLACFR DDWFLVYLYQRWL YPVDKRRVNEFGES YE 
EKATRAPHTD 


5794 


1 


50 1€ 


^K;PRXSVWbbLLPAALIAHEEHSRAAAKGGCAC-£GCGKCDCHGV 
KGQKGE RGIjPGIjQG" IGF FGMQGPEGPQGPPtiOJviaU A\7tFVjJLr%» 
TKGTRG P PGASG YPGNPGL PG I PGQDGPPGPP3I PGCNGTKGER 
GPLGFPGLPGFAGKPGPPGbPGMKGDPGE I LGHVPGMLLKGERG 
FPGIPGTPGPP<3bPGUJ3PVGPPGFTGPPGPPGPPGPPGEKG0M 
GLS FCJG PKGDKGDOG VSGP PG VPGOAOVQEKGDFATKGEKGQKG 
EPGFOGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YPGLIGRQGP\OGEKGEAGPPGPPGIVIGTGPIiGEKGERGYFGT 
PGPRGEPGPKGFPGLPGQPGPPGbPVPGQAGAPGFPGERGEKGD 
RGFPGTSLPGPSGRDGLPGPPGSPGPPGQPGYTNGIVECQPGPP 
GDQGPPG I PGQPGF I GEIGEKGQKGESCblCDI DGYRGPPGPOG 
PPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPOGTPGLIGOPGAK 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








GEPGEr YFBLRI,KGDKGD?GFPGQPGMPGRAGSPGRDGKPGLPG 
PKGSPGSVGLKGERGPPGGVGFPGSRGDTGPPGPPGYGPAGPIG 
DKGQAGFPGGPGSPGLPGPKGE PGKI VPLPGPPGAEGLPG5PGF 
PGPOGDPGFPGTPGR\PGL\PGEKGAVG\OPGIGFPGPPG?K3V 
DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGJ.KGL 
PG L PG I PGTPGE KGS I G VPGVPGEHG A I GPPGLOG IRGEPGPPG 
LP G SVG S PG VPG I GPPGARG PFGGQG P PGLSGP PG I KGE KG F PG 
FPGLDMPGPKGBKGAOGLPGITG0SGLPGLPG0OGAPG1PGFPG 
SKGEMGW«GTPGQPGSPGPWGAPGLPGEKGD\11GFPGSSGPRGD 
PG1-.KGDKGDVGI/PGKPGSMDKWKGSMKGQKGDOGEKGQ1GP1G 
EKGSRGDPGTPGVPGKDGOAGOPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGEKGVPG1PGPOGSPC-LPGDKGAKGEKGQ 
AGPPGIG1PGI.RGEKGDOGIAGFPGSPGEKGEKGSIGIPGMPGS 
PG I ; KG S PGS VG Y PGS PG L PG E KGDKG L PGLDG 3 PG VKGE AG L?G 
TPGPTGFAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 
DKG.'E KGEVGFPGLAGSPGI PGSKGEOGFMGPPGPQGOPGLPGSP 
GHATEGPKGDRGPOGQPGLPGLPGPMGPPGLPGIDGVKGDKGKP 
GWPGAPG V PGPKGDPGFOGMPGIGGS PG I TGS KGDHGPPG VPGF 
<?GPKGLPGLQGI KGDOGpQGVPGAKGbPGPPG PPGPYDI 1 KGEP 
GLPGPFG ? PGLKGLQGLPGP KGOOGVTGLVG I PG PPG I PGFDGA 
PGOKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDKGFL, 
VTRHSQTIDDPQCPSGTK1 LYHGYSIiIjYVQGNERAHGQDIiGTAG 
SCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSKAP 
1 TG RN1RPFTSR CAVCEA PAM VMAVHSQTI Q I p PCPSGWSS LW I 
GYS I--VMHTSAGAEGSGOALASPGSCLEEFRSAPFI ECHGRGTCN 
YYANAYS FWLAT J ERSEh'FKKPTPSTLKAGELRTHVSRCOVCMR 
RT 


5795 


1192 


63 


STR SPTVEY 1 SAH PHI LFMLLKGYEAFQ I ALRCG 1 MLREC1 R HE 
PLAKlll,FSNOFRDFFKYVELSTFDIASDAFATFKDLLTRHK\nL> 
VADFLEONYDTIFEDYEKbLOSHNYVTKRQSLKLl,GELILDRl{N 
FAI MTKY 3 SKPENLKLMMNLLRDKSPKI QFEAFHVFKVFVASPH 
KTQ P 1 VE 2 LLKNQPKLI EFLSS FQKERTDDEQFADEKNYLI KOI 
RDI.K KTAP ♦ RALR VS KR 


5796


- 2 


107F 


GRVGWElvWCMYISPPKDWWDAGDPSLPIRTPAMJGCSFWNRKF 
FGE I GL1»D PGMDVYGGEN I ELG I K VWLCGGSME VL PCSRVAH I E 
R KKK PYWS N I GFYTKRNAUi VAEVWMDD YKS H VY I AWNLPLEKP 
G IPI GDVSERRALRKSLKCKNFQWYLDHVr PEMRR YNNTVAYGE 
LRJW KAKD VCLDCK5 PLENHTA I LYPCHGWGPQLARYTKEG FLHL 
GAl^TTTLLPDTRCLVDNSKSRLPQbLDCDKVXSSLYKRWNFlO 
NGA1 MNKGTGRCLEVENRGLAC IDLI LRSCTGQRWTI KN3 1 K*R 
EGAG ALEPG P QDMAAP PN 1 WTSCPGGETARGRQ VLDG P PRAS PG 
QHRDPG 


5797 


2 


09: 


PRVRQKTLVDVTLENSN 1 KJDQI RNLOQT Y EASMDKLR EKQRQLE 
VA0VENQLLKWKVESS0EANABVMREMTKKLYSOYEEKLOEEOR 
KHSAEKEALLEETTJSFLKAIEEANKKMC^AEISLEEKDQRIGFX 
DRLIERMEKERHOIX)L0Ll>EHETEMSGELTDSDKERY00LEEAS 
ASLRER IRHLNDWVHCQgKKVKQMVEElESLKXKLQQKOliLI LQ 
LLEKISFLEG ENNELQS RLDY LTE TQA KT EVE7RE I G VGCDLLP 
SQTGR TRE I VM PS JRK YTP Y TR VLE LTMKKTLT 


5798 


644 


115 


KILGSRWKSMSNOEKOPYYEEOARLSKlHLEKiPNYKYKPRPKR 
TCJVDGKXLRIG2YK0LMRSRROEMR0FFTVGQOPOIP1TTGTG 
WtfPGAITttATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKT 
DGGS 1.AGNEM INGEDEMEM YDDYEDDPKSDYSSEHEAPEAVS AN 


5799 


2679 


1425


LL ST Y1KF2HLF? BT KAT I QG VLRAG SQLRNAD VE LQQRAV E Yb 
TLSSVASTDVLATVLEEMPP FPERESS I LAKLKRKKGPGAGSAL 
DDGRRDPSSND INGGKE PTPSTVSTPS PSADLLG1>RAAPPPAAP 
PAS AGAGNL1.VDVFDGPAAQ PSLGPTPEEAFLSPGPEDIGPP I P 
EADELLNKFVCKNNGVLFENOLLQIGVKSEFR0NLGRMYLFYGN 
KT£VOF0NFSPTVVHPGDI^TOI^V0?KRVAAC\aX3GAQV00Vl> 
N1ECLRDFLTPPLLSVRFRYGGAP0ALTLKLPVTINKFFCPTEM 
AAODFFORWKOLSLPOQ^OKIFKANHPMDAEVTKAKLLOFGSA 
LLDNVCPNPENFVGAGIlOTKALOVSCLLRLEFNAOAOXiVRLTL 
Ki Sj<EFVSRHljCEIjliAv?yF 


5800


2679 


143b 


LLSTy I KF: NLFPETKATI0GVLRAGS0LRNAD\^ELOORAVEYb 
TliS S V ft S TDV bATVLE SMP P FPER ESS I IiAKLKRXKGPG AG S AL 

ddgrrtpssitoi>3ggkhptpstvstpspsad: j lglraap?paap 
pasaga g nl.lvdv fdg ?aaq ps lg ptpeeaf^s pg ped j gpp j p 
e ad eli nkfvcknmgvlfenollqigvksefronlgrmyl.fvgn 
ktsvqfohfs?tv\'hpgdlotqliav0tkrvaaqvdgga0vqovl 
nieclrdfltppli.svrfryggapoaltlklpvtinkffcptem 
aaodff0kwkouslp00ea0k1fkawhpmdaevtkakllgfgsa 
lldnvd p n penkvgag 1 1 0t kalqvgcllrle ?n ao aqm y rltl 

RTSKEPVSRHbCELLAQQF 


5801 



3 


i4i:? 


FPRLYHL 1 PDGEITS I XINRVDPSESLS JRLVGGSETPLVHI 1 1 
QHiyRDGVIARDGKLLPGDIlLKVNGMDISNVPHNyAVRLLRQP 
CQVLWLTWRSOKFRSRNNGQAPDAYRPRDDSI-irVILNKSSPEE 
QLG I K LVT< KVDEPGV Fl FNVLDGG VAY RHGQLEENDRVLAI NGH 
DLRYGSFESAAHLIOASERRVHLWSRQVRQRSPDIFOEAGWNS 
NGSWSPGPGERSNTPKPLHPT1TCHEKWN1QKDPGESLGMTVA 
GGASHREWDLPI YV1 SVEPGGV J SRDGR3 KTGD I I.LNVDGVELT 
BVS RS EAVALLKRTSSS I VLKALEVKE YE PQEDCSS PAALDSNH 
NMAPP S D WS P SWVKK LE LPR CLYNCKD I VLRRNT AGS LG FC I VG 
GYEEYNCNKPFFIKS1VEGTPAYTJDGRIRCGDILLAVNGKSTSG 
MIHACLARLLKELKGRI TLTI VSWPGTFL 


5802




290 


CFSLYQJ KLRIMDLPTLLRHAFREMFSVGGLFWKFR 1 Rl I LCLM 
GAFFYI,3SPbDFVPEAl>FGILGn.DDFFVIFLLLIYISlMYREV 
JTQRLTR 


5803 


2234 


1299 


EAOFGTTAEIYAYREE0DFG1BIVKVKAIGR0RFKVLELRT0SD 
GIQQAKVQ1LF5CVLPSTMSAV0LESLNKCQZFPSKPVSRED0C 
SYKWWOKY0KRKFHCAKLTSWPRWLYSLYDAETLMDRIKK0LRE 
KDENLKDDSLPSNFIDFSYRVAACLPIDDVJuRIOIiLKIGSAIQR 
LRCELDJMNKCTSICCK0C0ETE1TTKNEIFSLSLCGPMAAYVN 
PHGYVHETL7VYKACNLNLIGRPSTEHSWFPGYAWTVA0CKICA 
SH1GWKF TATKKDy.S P 0 KFWGLTR SALLPT2 PDTEDE ISPDKV3 
LCL 


5804 


•J. 


1707 


EMEK0RCFECRKR1EEERKRR1E0DMLEKRKI0RELAKRAEQ1E 
DINNTGTH'SASEEGDDSLLITWpVKSYKTSGKMKKNFEDltEKE 
RBEKER 1 KYEEDKR 1 RYEEQRPSLKEAKCLSLVMDDEI ES EAKK 
ESLSPGK1.KLTFEELERQRQENRKK0AEEEARKRLEEEKRAFEE 
ARRQMVNEDEENODTAKIFKGYRPGKLKLSFEEKERQRREDEKR 
KAEEEARKK I EEEKKAFAEARRNMWDDDSPEKYKTISOEFLTP 
GKIjEINFEELbKQKKEEEKRRTEEERKHKLEMEKOEFEO^ROEM 
GEEEEENETFGLSREYEEMKLKRSGS3QAKNLKSKFEK3GOl>S 
EKE I OK K 1 fc L£.RARR RA I DLE I KER EAENFHEEDDVTJVN PARKS 
EAP FTHKVNMKARFEQMAXAREEEEQRRIEEQKLLRMQPEQREI 
DAAbQKKREEEEEEEGSIMNGSTAEDEEQTRSGAPWFKKPbKNT 

TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


Y I S DTbGCV Y KS K I R W W I EENGGNGN I S VDDIj J ALLDLAEHASS 
AFKESQOQS EDREYEVKERLYPKSXRRYT3TYN1AGYQGE1 EVGL 
YTI QJ LQLl PFFDNKNELSKRYMVNFVSGSSD3 PGDPNNE Y KLA 
LKNYIPYLTKLKFSLKKSFDFFDEYFVLLKPRNNIKQNEEAKTR 
RKVAG Y FKK YVD I FCLLEESQNNTGLGSKFSBPLO VERCRRNLV 
ALKADKF SGLLEYLJ KSQEDA1 STMXCI VNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFLLPM 
ASMWLR S LLKP I HVFFGAAI LSLSIAS VISGINEKLFFSLKl^TT 
RPYHSLPSEAVFANSTGMLWAFGLLVLYILLASSWKRF 


5807 


2267 


1302 


RFS KKT FRR PMAVD1 QPACLGLYCGKTLLFKNGSTE I YGECGVC 
PRGQRTWAOKYCQPCTESPELYDWLYIGFMAWLPLVLHWFFIEW 
YSGXKSSSALFQHITALFECSMAAIITLLVSDPVGVLYIRSCRV 
LMLSDW YTMLYNPSPDYVTTVHCTHEAVYPLYTI VFI YYAFCLV 



LMMLLRPLLVKKj ACGLGKSDRFKSIYAA1 YFFP1LTVL2AVGG 
GLLyYAFPYI ILVLKLVTLAVYMSASE1 EKCYDLLVRK.KRLJ VL 
FSKVJLLHAYGl I SISRTOKLEQPLPLiALVFTr ALFYLFTAKFT 
EPSRJLSEGANGP: 


5808 


; 


433 


SL P DSG V V E Y IS NGG VADNKKDFG E LR Y N F C 1-MN FS CNG KNG S S 
EGR I THGFOLKS^.YENNLMPYTNYTFDFKGVI^Y J FYSKTHf4NV 
I«VLGFLDPOWLVENNI1X3CPHPHIPSD>IFSLLTOLELKPPI>LP 
LWGVHLPKRR 


5809 


464 


2422 


1 LVPGFOGi LHPGVYCALQSQHQAQELV/vM DSCEVSGLCRKGG 
RCV>ITHGSFECYCNDGYl>PFNGPEPFHPTTDArSCTEIPCGTPP 
EVPDGYI IGNYrSSLGSQVRYACREGFFSVpRDTVSSCTGLGTW 
ESPKLHCQEINCGNPPEMRHA3LVGNHSSKLGGVARYVC0EGFE 
SPGGKITSVCTEKGTWRESTLTCTEI1.TK!: HDVSLFNBTCVRWQ 
1 NSRK1NPK I S YV1 S I XGQRLDPMESVSEFTVNLTTDSRTPEVC 
LALY PGTNYTVN1 STAPPRRSMPAVIGFOTAEVDLLEDDGSFNl 
SIFNETCLKLNRRSRKVGSEHMYOFTVLGC^WYIANFSHATSFN 
FTTR E0VPVVCLD1,YPTTDYTVNVTLLRS PKRHSVQ1T1 ATPPA 
VKQT2 17NJ 5GFNETCLRWRS3 KTADMEEM Y L FH I WGQR W YQKEF 
AQEMTFN1 SSSSRDPEVCLDLRPGTNYNVS1 RAISSBLPWISL 
TTOITF.PPLPEVEFFTVHRGPLPRLRLRW-.KEKNGPISSYQVbV 
LPLAbOST F S CDS EGASSFFSNAS DADGY V AAELLAKDV P DDAM 
EI PI GDKLY YGFYYNAPLKRGSDYCI I LR J 7SEWNKVRRHSCAV 
WAQV KDSS 1>MLLC^AGVGLGSLAWI I LTFLSFSAV 


5810




164} 


KVFGTHKDHEVSTbDTAI SAVKVQbAEFLENLQF.KSLRl EAFVS 
BlF.SFrNT3EENCT:KNEKRLEEQNEEMMKy:VLAOYDEKAOSFEE 
VKKKKMEFLHEQKiVHFLOSMDTAKDTLETSVREAEELDEAVFLT 
SFEE I N^RbbSAMESTASLEKMPAAFSLF EKY DDSSARSDO«bK 
QVAV P0PPRLEPOEFNSATSTT I AVYWEMN K EDV 1 DS FQV YCWE 
EPQDDOEVNELVEEYRLTVKESYCIFEDLEFDRCYQVWVKAVNF 
TGCSLPSERAIFRTAPSTPVIRAEDCTVCU'KTATIRWRPTTPEA 
T ET Y T L.E Y CRCHS PEG EG LRS FS G I KG LC L KW L.Q PNDN Y F FYV 
RAJNAF3TS EQSEAAL JSTRGTRFLLLRETAifPALHISSSGTVl 
SFGER^RLTEIPSVXGEELPSCGQHYWETTVTDCPAYRLGICSS 
SAV0AGAU5QGETSWYKHCSEPORYTFFYSGIVSDVHVTERPAR 
VG1 LLDYNNORLI Fl NAESEQLLFI I RHRFHEGVHPAFALEKPG 
KCThHhGlBPPDSVJUiK 


5811 




fibl 


AAAIADPLPEDKWSAEKRRPLKSSLGYEI'IKSLLNPDPICSHDVY 
WD1 BGAVRK YVQPFLNALGAAGNFSVDS01 LV YAMLGVNFRFDS 
ASSSYY^MHSLP^INPVESRTXJSSAASLYPVLNFLLYVPELA 
HE PLY 2 ODKDGAPVATNAFHSPRWGGIMVYNVDSKTYNASVLPV 
RVEVDMVRVMEVFLAOLRLLFGIAQPQLPFKCLLSGPTSEGLMT 
WELDRbLWARSVENLATATTTLTSLAOLLGKISNIVlKDDVASE 
VYKAVAAVOKSAEELASGHLASAFVASOEAVTSSELAFFDPSLL 
HLLYFPDDOKFAI YIPnFLPMAVPILLSLVKl FLETRKSWRKPE 
KTD 


5812 


5204 


2744 


GGK OR CQRG R S CG ARE BE VE PGTAR P P P AA £ AMDAS LEK J ADP T 
LAENG KNLK £ A VKMLEDSQR RTEEENGK KL I SGDI PGPLQGSGQ 
DMVS 2 LOL>VONIjMHGDEDEEPQS PR1 QNIGECGHMALLGKS LGA 
YISTLDKEKLRKLTTRIIiSDTTLWLCRIFRYENGCAYFHEEERE 
GLAK J CRLA I HSR Y EDFWPGFNVLYNKXP VI YL SAAAR PGLGQ 
YLCN0LGLPFPCLCRVPCNTVFGSQHOMDVAFLEKXIKDD1 ERG 
RLPLLLVANAGTAAVGHTbXIGRLKELCEOYG I WLHVEGVNLAT 
LALG Y VS SS VLAAA KCDSMTMTPGPWLGLPA V FAVTLYKhDDPA 
I>TLVAGLTSNKPTDKLRALPLWL^L0Y^GLD3FVERIKHACOliS 
ORLOESLKKVNYIKILVEDELSSPVWFRFFOELPGSDPVFKAV 
PVPKmPSGVGRERHSCDAIJWWLGEObKOLVPASGLTWDLEA 
EGTCLRFSPLMTAAVLGTRGEDVDOtVACl ESKLPVLCCTLQLR 
EBFKOEVEATAGLLYVDDPNWSGIGWRYEKANDDKSSLKSYPQ 
GEMIHAGLLKKLKELESDLTFKIGPEYKSKKSCLYVGMASDNVH 
AAELVETIAATARE I EDNSRLLENKTE WRKG 1 QEAQVELQKAS 
EERLLEEGVLRQIPVVGSVI>MWFSPVQALCKGRTFWLTAGSLES 
TE7 I YVYKAQGAGVTbPPTPSGSRTKQRLPGQKPFKRsLRGSDA 
LSKi'SSVSHIEDLEKVERLSSGPEOITLEASSTEGHPGAPSPQH 
TDCTEAFQKGVPKPEDDHSQVEGPESLR 




5813


2 93C


699 


H R DG VSG S L ER PLTDR S R TGA FAQQRG KMATAGGG SG AD PG S RG 
LLKL>LS FCVLLAGLCRGNS VERKI Y I PLN KTAPCVR LLNATKOI 
GCCS SISGDTGVI HWEKEEDLQWVLTDG PNPpYMVI.LESKHFT 
RDLM EKLKGRTS R I AGLAVSLTK PS PASG FS P$ VQCPNDGFGVY 
SNSYGPEFAHC?E3QWNSLGNGIAYEDFSFPIFLLEDENETKVI 
KOCYQDKKLSQNGSAPTFPLCAMOLFSHWAWLSFSTA?\CMRRS 
S 2 OS TFS INPKI VCDPLSDYNVWSMLKP1 NTTGTliKPDDRVWA 
ATR LDSRS FFWNV \ A PGAE5AVAS F VTQLAAAEALQ KA P DVTTL 
P RN V MF V F FQGETFDY I GS SRMV Y DMEKG KFP VQLENVDS FVEli 
GQVMjRTSIiELWKHTDPVSOKNESVRNQVEDLLATLEKSGAGVP 
AVIbRRPN0S0PLPPSSL»0RFLR7\RNlSGWLADHSGAFHNKYY 
QS1 Y DTAEN INVS Y PEW UEP LKE / ET WN FG * ODTAKALADVATV 
LGRA DYE LAGGTN FSDTVOADPQTVTRLLYG\ FLI KANNS WFOS 
I LOG RDLflSYLG * RGLFQH\ YIAV\ SSPTNT3 YV/VLQYALANL 
TGT WNl/TR EQCQDPSKV PS ENKDL Y E Y S WVQG P U1S N ETDRLP 
R CVR STARLARALSPAFELS QWSS TEYSTWTES R WKD I RAR I FL 
IASKELELITLTVGFGILIFSLIVTYCINAKADVLFIAPREPGA 
VSV 


5814


8000 


432 


AIjKCRP.RRVLAILVoPVOPDRMAEF.GAVAVCVRVRPLNSREESL. 
GETAOVYWKTHNNVIYPVDGSKSFNFDRVLHGNETPKNVYEA\l 
AAV • 3 DSATQGYNGTIFA\YGOT\ASGKTYTMMGSEDKLGVIPQ 
GCPHGHPSQKJ * EVFLDREFLLRVS YME3 YNBTI TDLLCGTQKM 
KPU I RSDVKRKVYVADLTEEWYTSEMALKWITKGEKSRHYGE 
TKMNORSSRSHTIFRMlLESREKGEPSNCEGSVKVSHLNLVDLA 
GSER AAOTGAAG VRLKEGCN I NRSLF3 LGOV2 KKLSDGQVGGFI 
NyFDSKLTRILONSLGGNPKTRIlCTITPVSPDETLTAI.OKAST 
AKYMKNTPYVNEV£TDEALLKRYRKEIMDl,KKQLEEVSLETRAQ 
AMEK DQLAQLLE E K DL1QKVQN EK 1 ENl/TRML VTS S S LTLQQ3L 
KAKK KRRVTWCIXJKlNKMKWSKrYACOFNl PTN3TTKTHKLS3NL 
LRE J DES VCSESDVFSNTLDTLSE3 EWNPATKLLNQEN I ES ELN 
SLRADYDKLVLDYEOLRTEKEEMELKLKEKNELDEFEALERKTK 
KDOEMQL 1 HE1 SNLKNLV WIRE WNQDLENELSS KVEU.KE KED 
QIKKLQEYIDSQKL.ENIKWDLSYSLESIEDPKOMKQTLFDAETV 
AIXAKRESAFLRSENLELKEKMKELATTYKOMENDIOLYQSOLE 
AKKKMOVDbEKELOSAFNElTKbTShlUGXVPKDLLOJLELEGK 
I TD1 OKELNKEV EENEALREF V I LLSELXS LPSEVERLRKE IOD 
KSERLHI 1 TSEKDKL FSEWH K£S R VQGLL E E I G KT KDDLATTQ 
SNY K STDQSFQN FKTLHMDFEQKY KMVLEENERMNQE I VN1>SKE 
AQKF DSSLG ALKTELSYKTOELQEKTREVQERLNEMEOLKEQLE 
NRDSPLOTVEREXTLITEKLQOTLEEVKTLTOEKDDLKOLOESL 
OTERDOLKSDIHDTVNMNIDTOEOLRNALESLKOHOETINTLKS 
KI SEE VSRNLHMEENTGETKDEFOOKMVGI DKXQDLEAKNTQTL 
TADVKDWEI IEQQRKI FSLIOEKNELOOMLESVIAEKTQLXTDL 
XEN3 EMTJ ENQEELRLLGDELXXQ0E1 VAOEKNHAI KXEGELSR 

tcdr:j\eveeklxeksoqu?ekqqqllnvoeemsemqxkineie 
nlxnelkn keltlehmeterlelaqklnen yeevks i txer kvl 
relok s feterdhlrg y 1 re 3 eatglotkeelki ah 1 hlkekqe 
ti delrrs vsektaqi intqdlexshtkloeeipvlheeoellp 
k^xwsetoetmnelellreosttkdsttlariemerlrbnekf 
0es0ee3 ksltkerdnlktikealevkhdolkehi retlak iqe 
sqs kg eqs lnnkekdmettx1 vsemeqfkpkdsallr i e 3 emlg 
lsxrlqfshdemks vaxekddlqrlqevlcsesdqlk eni ke 3 v 
akhleteeelxvahcclkeqeetirelrvnlsekete 1st3qxq 
lea i ^jdklonkiqei yekeeqlnikq3sevqexvnelkofkehr 
kaxesalos i eskmleltnrjbqesoeel q3 m i xekeemxrvqea 
w3 e hdqlkentke 1 vakmke sqexeycflkmtavnetoekmce 

IEHLKECFETQICLNLEH1ETENIRLTQ3LHENLEEKRSVTXERD 
DIiR5VEETLKVERDOLX£NIjRETITRDLEKOEEI.KIVHKHLKEH 
CETIDXLRGIVSEKTNEISNHOXDLEKSNDALKAQDLXlQEELR~ 
1 AHMHL KEQQET 1 DK L*G I V ? E XTDKLSNMQXDLENSNAXLOE X 
I QE LKAK EHQLI TLK KDVN ETQXKVS EM EQLXXQ I KDQS LTLS K 
LEIENLNU^OKLHENLSEMKSVMKERDNLRRVEETLKLSRDQLK 
ESbQETKA^Di^lOOELrCTAitMLSKEHKETVDKLREK 1 SEKTI Q 
I SDIQKDLDXSKDEL0KK1 CELQKKELOLLRVKEDVWMSHKKI N 
EMEQLKKOFEPNYLCKCEMCKFObTKKLHESLEEIRIVAKERDE 
LRRIXESL.XWERC0FIATLREMIARDRONHQVKPEKRLLSDGQO 
HbMESLREKCSRIKELLKRYSEMDDHYECLNRLSLDLEKEIEFH 
Ri MKKI KYVLSY VTKI KEECHFCINKFEMDFIDEVEKQKELLI K 
3 QHLOODCDVPSRE LRDLKl^ONMDLH 1 EEI LKDFS E £ EFPS 1 K 
TEFOCVLSNRKEMTOFLEEWLNTRF DIEKLKNGIQKENDRICQV 1 
NNFFNNRJ lAIMNESTEFEERSATISKEWEQDLXSLXEXNEKLF 
XNYQTLKTSLASGAQVNPTrODNXNPHVTSRATQLTTEXlRELE 
NSLHEAKES AMHKESXI I KMQKEIiEVTI>JDI I AKLQAKVHESNKC 
LEXTXET1 QVLODKVALGAKPYKEE1 EDLKMKLGK JDLEJCMKNA 
KEFEKE1 SATKATVEYQKEV3 RLLRENLRRSOQAQDTSVI SEHT 
DPQPSNKPLTCGGGSGIVOKTKALILKSEHIRLEKEISKLKQON 
EOLIKOKNELLSWNQHLSNEVKTWKERTLKREAHKQVTCErJSPK 
SPKVTGTASXKXOITPSQCKERNLODPVPKESPKSCFFDSRSKS 
LPSPHPVRYFDNSSLGLCPEVCNAGAFSVDSQPVgPW/vRLFQGK 

dvp\eckto 


5815


23 


14bC 


SEbVMWTVONRESbGLLSFPWiTMVCCAKSTNEPSKMSYVKET 
VDRLLXGYDI RLR PC FGG P PVDVGMRJ DVAS IDMVSEVKMDYTL 

tmyfooswxdkrlsysgiplnltldnrvadolwvpdtyflndkk 

SFVHGVTVXITOMIRUlPDGTVLYGLRirrTAACMMDLRRYPl.DE 
CNCTLEIESYGYTTDDlEFYls'NGGEGAVTGVNXIEbPOFSlVDY 
KMVS XKVEFTTGA YPRLS LS FRLXRNI GYFI LQT YMPSTL I TIL 
SWVS FWI NTY DAS AARV ALG I TTVLTMTT I STHLRETLP K I PYVK 
AIDIYLMGCFVFV FLALLE YAFVNY I F FGXGPQKXGAS KODOSA 
NEKNKLEMNKV0VDAHGNILLSTLE2RNETSGSEVLTSVSDPKA 
TMYSYDSA£IQYRKPLSSRE\A*GRAP'JRKGVPSXGR1RRRAS\ 
OLKVKIPDLTDVNSIDKWSRMFFPITFSLFTJWYWLYYVH 


5816 

1 


861 


19: 


TSSRSRAAAOEGDAETPGSVERRGRRAGAEDGMSQAPGAOPSPP 
TVYHERQRLELCAVHALNNVL0QQ1>FSQEAADEI CKRLAPDSRL 
NPHRS LLG1GNYD VNV 1 MAALOGLGLAA VWWDRRRPI>SOLAL?0 
VLGLILNLPSPVSLGLLSLPLRRRKLRWPCARL/VTVSYYNLDS 
K\LRAPEGPGGLRTE\ ♦ GP FLAAALAOGLCEVLLVVTKEVEEKG 1 
SWLRTD 


5817


651 


life 


RLFRGPGANRGRSC.^GCSGGREPSGGALPKRHCPC*PPSFPAAD 
VMSNTTV PNAPQANSDSMVGY VLGP FFL1 TLVGVVVAVVKYVQK 
:<KRVDRLRHHLLPMYSYDPAEELHEAEOELLSDMGDPKW\QAG 
RVATSTSGCHCWMSRRDLTPLPHPSEPGVLDCLGPCHLLPLLSP 
GSPCWVLGU1FSLHPPSAASASHALT1TSI,PPGLLPFVGVELTA 
HPQALMGRGFPSGMAAAGRHLCrL 


5818 


3 


392* 


OALRDKLWI FLVQS f YAVRHTES WKLMSTDDQQKI QAAAFDKGD 
DRRLGKKPIFSSSOORKQVSDSGDIKIRSWRGNNKKECWSYLST 
NKKMKSDGLGASGHSSSTNRNSINKTLXODDVKEKDGTKIASKI 
TXELKTGGXNVS GKTKTVTKS KTENGDXARLENMS PRQWERSA 
TAAAAATGQKNLLNGKGVRNQEGQI SGARPKVLTGNLNVQAKAJC 

TEEEKPSGHKLSFCDSPGOMMKMSVDSVXNSTVAIXSRPVSRVT 
NGTSNXXSI HEODTNVWJSVLKXVSGXGCSEPVPQAI LXXRGTS 
NGCTAAQORTXSTPSNLTKTpGSOGESPNSVKSSVSSROSDENV 
AXlDHNTTTEKGAPKj?XMVXOVHTALPKVNAKIVAMPXNLNQSX 
XGETLNNKDS KQXM PPGQV3 S XTQPSSORPLXHBTSTVOXSKFH 
DVRDNNNXDS VSEOXFHKPLI NLASEISDAEALQSSCRP \ DPQK 
PLNDQEKEXLALECQNISKLPXSLXHELESKQICLDKSETXFPN 
HXETDDCDAA^ICCKSVGSDhTWSXFYSTTALXYMVSNPKENSI* 
NSNPVCDLDS TSAGQI HLI SDRENQVGRXDTWXQSS I KC V SDVS 
LCNPERTNGTLNSAOEDXKSKVPVEGLT J PS KLSDESAMDEDKH 



ATADSDVSSKCFSGOLSEKNSPKNMETTSESPESHETPETPFVGK 
WNLSTGVLHQRE£PESDTG£ATTSSDC 1 KPRSEDYDAGGSQDDD 
GSNDRGJSKCGT.^LCHDFLGRSSSDTSTPEELKIYDSNLRIEVK 
MK*<OSSNDLFOVK c TSr>DF] PPKRPFIK^R^ATVH^RFRFNTPR 

GSVQFAOEIDQVS'SSADEIIDERSEAENVAENFSISNPAPCOFQ 
GIINLAFEDATENECREFSANKKFKRSVLLSVDECEELGSDEGE 
VHTPFQASVDSF^PSDVPDGJSHEHHGRTCYSRFSRESEDNILE 
CKQNKGHSVCXNESTVI>DLSS 1 DSSRXNKQSVSATEKKNT1DVL 
SSRSROLLREDKKVNNGSm/fNDIOORSKFLPSDVKSOERPCHL 
DLHOREPNSDIPK>:SSTKSLDSFRSOVLPQEGPVKESHSTTTEK 
AN I AI»SAGDI DDCDTLAQTRM Y DHR ? S XTLSP I YEMDV 1 EAFEQ 
KVESEPHVTDMDF ♦ DDQHFAK0DWTLLKQLLSEOPSNLDVTNSV 
PEDLSLAOyblNQTLLLARDSSKJPOGITHIDTLNRWSELTSPLD 
SSASl TKAS FS S FDCS PQGEKTI LELETQH 


5819


1 


55S'; 


AAAGLLGALHLVKTLWAAARAEKEAFVQSESI I EVLRFDDGGL 
LQTETTLGLSSYO0KSISLYI?GNCR?3RFEPPMLDFHEQPVGMP 
KMEKVYL1INPSSE * Tl TLVS1 FATTS HFHAS FFQNRK J L PGGNT 
SFDVS /VFIiARWGhTVENTLFI NTSNHGVFTY\QVFGVGVPNPY 
R LR PFLGARVTVNS S FS PI I N 1 H NPHS E PLQWEMYSSGGDLHL 
E1»PTG0C<5GTRKLK'E 1 PPYETKG VMRAS FSSREADNHTA FI R I K 
TNASDSTEFIIliPVEVEVTTAPGIYSSTFMLDFGTLRTQDLPKV 
I.NI.HLLNSGTKDVFITSVRPTPQ\NDAITVHFKPITLKAS\ESK 
YTKVASISFDASKAKKPSQFSGK1TVKAKEKSYSKLEIPYQAEV 
LDGYLG FDHAATL FH JRDS PADPVER P I YLTNTFS FAIL I HDVL j 
LPEEAKTMFKVHNFSKPVLI LPNESGYI FTLLFMPSTSSMHIDN 
m LLITNASKFHIPVRVYTGFLDYFVLPPK1EERFI DFGVLSAT 
EASNI L FA 1 1 NSNP 1 ELAI KS WH 1 1 GDG \LSJELVA VDRGNRTT 
1 1 SSLPECEKSSSSDOSSVTLASGYF\AVFRVKLTAKKL\EGIH 
DGAlQITTDYEll/H FVK\AV 1 AVGSLTCSPKHWLPPSFPGK1 
VHQSLN I MNS FSQKVKIQQI RSLS EDVR FY YXRLRGNKEDLEPG 
KKSKIANIYFDPGLOCGDHCYVGLPFLSKSEPKVQPGVAMQEDM 
WDADWDLHCSLFKGWTG 2 KEN SGHRLS A 1 FEVNTDLQKN 1 1 S XI 
TAELSWPSILSSPKKLKFPLTNTNCSS\EEElTLENP/SOPVPV 
YVQFIPLALYSNPSVFVDKLVSRFKLSKVAKIDLRTLEFOVFRN 
SAHPL0SSTGFMEG\1^PHL1LNL1LXPGEKKSVKVX\FTPVMN 
R7 VSSL.I I VRNNLr^MDAVmOGCGTTENL.RVAGKLPGPGSSLR 
FK I TEALLKDCTDS LKLRE PN FTL KRTFK VENTGQ1X?I H I ETI E 
1 SGYSCEGYGFKWNCQEFTLS ANASRD1 1 ILFTPDFTASRV1R 
ELKPI TTSGSEFVF I LNASLP YHMLATCAEALPRPNWELALY 1 1 
I SG IMS ALFLI>V1GTA\ YLEA0G1WEP\ FRRRLS \ FEASNPPFD 
VGRPFDLRR1 VG3 £S EGNLNTbSCDPGHSRGFCGAGGSSSRPSA 
G SH KQ * G ? S GH PH5 £ H SNRN S AD VDD VRA YNSGRTS SMTS AOAA 
S SOPANXTRPLVI.DSNTGAOGKS AGR KS KGAKQSQHGSQHHAHS 
PLE0HPQPPLPPPVP0POEP0PERLSPAPLAHPSHPERASSARH 
S SEDSDI TSLI EAKDXDFDHHDS PALEVFTEQPPSPLPKSKGKG 
K PLCRKVK P PKKQE E X EKKG KG KP QEDE L KDSLADDDS S S TTTE 
TSNPDTEPLLKEDTE KQKGXOAMPEKKES EMSQVXQXSKXLLN I 
KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPliPTAMTSGSK 
SRNAOKTKGTSKLVDMRPPALAKFLPNSQELGNTSSSEGEKDSP 
PPEWDS VPVHKPGS5TDSLYKLSLGTLNADI FLKQRQTSPTPAS 
PS PPAAPC? FVARGS YSS I VNS S SSSD P K I KQPNGS KHKLTKAA 
SLPGKKGNPTFAAVTAGYDKSPGGNGFAKVSSNKTGFSSSLGIS 
HAPVDSDGSDSSGLWSPVSNPSSPDFTPLNSFSAFGNSFNLTGE 
VFSKLGLSRSCNQASORSWNEFUSGPSYLWESPATDPSPSWPAS 
SG5PTHTATSVLGNTSGLWSTTPFSSS I WSSNLSSALPFTTPAN 
TLASIGLMGTENSPAPHAPSTSSPADDLGQTYNPWRIWSPTIGR 
RSSDPWSNSHFPHEK 


5820


310 


1270 


R VS LSGPVSLGVLLCARSSTMGKR DNRVA YMNPI AMARSRGPI Q 
SSGPTIQ\VI * IDOC LPGKK* KSN * KRKR K/DSKALAEFEEKMN 
ENM KKELEKHREKL1S GSESS S KKRQRKK KEXKKSW ♦ \DSSSS\ 
SSSSDSSSSSSDSEDEDKKC^KIIRKKKKKRSHKSSESSMSETES 
DSKDSLKKKKKSKDGTEKEKDI KGLSXKRKMYSEDKPLSSESLS ' 
ESBYISEVRAKKKKS5EEREKATEKTKKKKKHKKHSKKKKKXAA 
SSS PDS F*K* EKSGFPY KESAMSEE I STVKTTT YLLKCMN FLVF 
G1IPGLFSSHSDATV 


5821


179 


915 


KWR^OSWRWPKPGTNWMIvSCSVCWRRVTWTGSVVIMRKLGKHPOT ' 
PT/IKDCSIAATCKRPSARFPHQRRKKRREMDDGLAEGGPORSN 
TYVIKLFDRSVDLACFSENTPLYPICRAWMRWSPSVRERECSPS ' 
SP1>PPLPEDEEG\SEVTNSKSR*CVOACFPTHTPGGOPKNACR\ 1 
SRI PSPLAALRMOGT P* RWSPFEPEP SPSTLI YRNMQRWKRI RO 
RWKEASHRNQLRYSESMKILREMYERQ 


5822


464 


4379 


OTLKEMPIVMARDLEETASSSEDEEVISQEDHPCIMV/TGGCRRi 
PVLVFHADA 1 LTKPIW3 R VIGERYHLS YK1 VRTDSRLVRS I LTA 
HG FHEVHPSSTD YN LMWTGSHLKP PLLRTLSEAQKVNH F PRS YE 
LTR KDR LYKN 1 1 RMQHTHG FKAFH I LPQTFLLPASYAE FCNS YS 
KDRGPWIVKPVASSRGRG\VYLINNPNQISLEENILVSRyiNNP 
LL2DDFXFDVRLVVLVTSYDPLV1YLYEEGLARFATVRYDOGAK 
NIRNQFMHLTNYSVNKKSGDYVSCDDPEVEDYGNKWSMSAMLRY 
LKQEGRDTTALMAKVEDLJIKTIISAEIiAIATACKTFVPHRSSC 
FELYGFDVLIDSTLKPViLLEVNLSPSLACDAPLDLKIKASMISD 
MFTWGFVCODPA0RASTRPIYPTFESSRRNPFQKP0RCRPLSA 
SDAEMKNLVGSAREKGPGKLGGSVLGLSMEEIKVLRRVKEENDR 
RGGFIR I FPTSETWL I YGSYLEHKTSMNYMLATRLFOURNTADG 
APEI>KI*SIA*SKAKLH^ALYERKLLSLEVRKRRRRSSRbRAMRP 
KYPVITQPAEKNVKTETESEEEEEVALDNEDEEOEA50EESAGF 
LRENOAKYTPSLTALVENTPKENSMKVREWNNKGGHCCKLETOE 
LEPKFNLMQI LODNGNLSKMQARIAFSAYLQHVOI \RLMKDSGG 
OTFSASWAAKEDEOHELWRFLKRASNNLOHSLRMVl.PSRRLAL 
LERTRIUVHQLGDFI I VYNKETEQWAEKKSKKICVEESEEDGVNM 
EN FQE FI RQASEAELEEVLTFYTQKNKSASVFLGTHS KI £ KNNN 
KYSDSGAKGDHPETIMEEVKIKPPK00OTTEIKSPKLSRFTTSA 
EKEAXLVY SNS S SGPTATLQKI PNTHLS S VTT S DLS POPCHH S S 
LSOIPSAIPSMPHQPTILLNTVSASASPCbHPGACNIPSPTGLP 
RCRSGSHTIGPFSSFOSAAHIYSQKLSRPSSAKAGSCYLNKHHS 
G I AKTQKEGEDASLYS KR YNOSMVTAELORLAEKOAARQYS PSS 
HI NLLTQQVTNLNLATG IIKRS S AS AP PTI >RPI1SP SGPTW STQ 
SDPQAPENHSSSPGSRSl^QTGGFAWEGEVEIWVYSOATGWPOH 
KYHPTAGSYQLOFALOOLEOQKLQSROLLDOSRARHOAIFGSQT 
LPNSH LV3TMMNGAGCR I SSATASGQKPTTLPQKW P PPS SCASL 
VPKPPPNUEQVLRRATSOKASKGSSAEGQI'NGLOSSLNPAAFVP 
ITSSTDPAHTK3 MNKKHTEKQPVHHSWVHD 


5823


42 


2293 


LLTALSMEGGGGRDEPSACRAGPVNMDDPKKEDILLLADEKFDF 
DbSLSSS SANEDDEV F FGPFGHKERC3 AASLELNK PVPEQP PLP 
TSESPFA WS PLAGEK FVEVYKEAHLLALHI ESSSRWQAAQAAXP 
EDPRSCGVE3FI0ESX?\KINLFEKEK£MKKSPTSLKRETYYLS 
DS PLLG P P VGEPRLLAS S PALPS SG AO AR LTRAPG P PHSAHALP 
RESCTAHAASOAATQRXPGTKLLLPRAASVRGUGI PGAAEKPKK 
EI PASPSRTKI PAEK ES HRD VLP DKPAPGA VKVPAAGSHLGQG K 
RAIPVP\NKLGLKKTLLKAPGSYSK\liQRKSSSGA\VWSGASSA 
CTPQP VAKAKS SEFAS I PAN* LPGLCPNI S KS \GRMG PAMLRPA 
L\PAGPVG\ASSWOAKRVDVSEliAAEQLTAPP\SASPTQPOTPE 
GGG\QWLNSSCAW£ ES S OLNKTRS I RRKDSCLNSKTKVMPTPTN 
QFK I PKFS I GDS \ PDS STPKLSRA0RPOSCTSVGRVTVHSTP VR 
RSSGPAPOSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEFRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPOALNFSPEESDSTFSKSTATEA'ARBEAKPGGDAAPS 
EALLVDIKLEPLAVTPDAASQPLIDLPLIDFCDTPEAHVAVGSE 
S R PLI DLMTNT P DfWKT^VA KPS P WGQLI DLSS PL I QI 5 PSADK 
ENVDSPLLKF 


5824 


42 i 


22S3 


LLTALSMEGGGGPJ)EPSACRAGPVNMDDPKKEDILLLADEXFDF 
DL^LSSSEANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESPFAWSPLAGEKFVEVYKEAHLLALH3 ESSSRNQAAQAAKP 
EDPKSOGVERFIQESKF\K3NLF£KEKEMKKSPTSLttRETYYLS 
DSPLLC P P VG EPRLLA S S PALPS SGAQAR LTRAPG PPHSAHAL? 
RESCTAKAASQAATQR KPGTKLLLPRAASVRGRGI PGAAEKPKK 
El PA5PSRTKI PAEKESI1RDVLPDKPAPGAVNVPAAGSHLGQGK 
RAIPVF\N»KLGLKKTLLKAPGSYSN\LORKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFAS I PAN* LPGLCPN1 SKS\GRMGPAMLRPA 
L\PAGPVG\ASSWOAKRVDVSELAAEOLTAPP\SASPTQPOTPE 
GGGXQWLNSSCAWSESSOLNKTRSIRRRDSCLNSKTKVMFTPTN 
QFKIPKFSIGDS\PDSSTPKLSRAORPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\ C V PAR R R S SEPR KN S AMRTB PTRESNR KTDSR \ LVD VS PDR 
GSPP?RVPOALMF£PEESDSTFSKSTATEVAREEAXPGGDAAPS 
EALLVD1 KLEPLAVTPDAASQPLI DLPLIDFCDTPEAHVAVGSE 
SRPLIDLMTNTPDMNKNVAKP5PWGQL1DLSSPL3QLSPEADX 
ENVDSPLLKF 


5825


2 



421D 


FLOIESASPAPKSSGFLAAHPHSPGGSLATKGRSRLSAPGMLHL 
SAAPPA P P P E VTATAR PCLC S VGRRGEGG KMAAAGALER SFVEL 
SGAER ER PRH FREFTVCS I GTANAVAGAVK YSESAGGF Y YVESG 
KLF5VTRNRFIHWKTSGDTLELMEESLDINLLNNA1RLKFQNCS 
VLPGGVYVSETQNRVI Z I :MLTNQT VHRLLLPH PSRMY R S ELWD 
SQMOS I FTDIGXVBFTDPCN'YQLIPAVPGl SPNSTASTAWLSSD 
GEALrALPCASGGIFVLKl>PPYDIPGMVSWELKQSSVMORLl>T 
G WMP TA 1 RG DOS PSDR PLS LAVHC VEHDAF I FALCQDH KL-RMWS 
Y KEQMC LMVADMLEYV P V K KDLRLTAGTGH KLRLAYSPTMGLYL 
GI F\ r/*HA PKRG0FC1 FQLVSTESNR Y SLDH I S S LFTSQETLI DF 
A-LTSTDI WAl»WHDAEN0TVVKYINFEHtJVAG<3WNPVFt«10PLPEE 
El VI R DDQP PREMYLQS LFTPGQFTN E ALCKALQI FCRGTERNL 
DLSVJSELKKEWIAVENELOGSVTEYEFSQEEFRNIXK3EFWCKF 
YACCLOyCEALSHPLAbHLNPHTNMVCLLKKGYLSFLIPSSLVD 
HLYLLPYENLbTEDETTl SDDVD1 ARDVICLI KCLRLI EESVTV 
DMS V J MEMS C YNLQS P E KAA EC I LEDMI T I DVENVMED • CS KLQ 
ElRNPIHAIGLLIREMDYETEVEMEKGFNPAOPLNIRMNLTQLY 
GSNTAGYIVCRGVHKIASTRFLICRDLLILQCLLMR1GDAVIWG 
TGQLFQAQCDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 
LCNLS VLELTDSGALMANR FVSSPQTI VELFFQEVARKHI I SHL 
FSOPKAPLSOTGLNWPEMITAITSYLLQLLWPSNPGCLFLECLM 
GNCO Y VOLQDY I <?LLH P WCO VNVGS CR FMLGRC YbVTGEGQKAL 
ECFCOAASEVGKEEFLDRLTRSEDGEIVSTPRLQYYDKVLRLLD 
VI GLPELVIQLATSAI TEASDDW\KSQATX\RTCI FKHHlADLG 
\HNSOAYGSL* PQI PDSSROLDCLROLVWLCERSOLQDL.VEFS 
YVNUINEWGIIESRARAVDUfTHrTYYELLYAFHIYRHNYRKAG 
TVMFEYGKRJW3REVRTLRGLEKQGNCYLAALNCLRLIRPEYAWI 
VQPVS G A V YDR PGASPKRNH DGECTAAPTNRQI EI LELEDLEKE 
CSLAR I R LTLAQHD ?S AVA V AG S S 5 AEEM VTLLVQAGLFDTA I S 
LCCTFKLPLTPVFEGLAFKCIKL0FGGEAAQAEAWAWLAAN01>S 
S VI TTK ES S ATDEAWR LLS T YLERYKVQNNLYHHCVINKLLSHG 
VPLPNWLINSYXKVDAAELLRLYIJ^DIXDLTPYOVIRICCC 


5826 


3 


871 


KSOLLRDHSAPPPKPCTSVGAMGC*PRQ/SPKEOQRQLKKQKNR 
AAAQRS RO KHTDKADALHQOH ESl»EKDNIiAliRKEI QSLQAELAW 
WSRTLHV^ERLCPMDCASCSAPGI^CWDQAEGLLGPGPCGOHG 
CREQLELFQTPGSCYPAOPLS PGPQPHDS PSLLQCPLPSLSLGP 
AWAEPPVOLSPSPLLFASHTGSSLQGSSSKLSALQPSLTAOTA 
PPOPLKbEHPTRGKlX5SSPDNPSSAU5LARLOSREHKPALSAAT 
WQGLWDP S PH FLLAFPLLS S AQVHF 


5827 


194 


2287 


GMGSEKSAXXSYT1J?EPPFTLPSGLAVYPAVLQDGKFASVFVYK 
RENEDKVKKAAKVP* * HLKTLRHPCLLRFLSCTVEADGIHLVTE 
RVQPLEVALETLSSAEVCAGI YDILLALI FLHDRGHLTHNNVCb 
SSVFVSEDGHWKLGGMETVCKVS0ATPEFLRSIOS1RDPAS1PP 
EEMSPE FTTLP ECHGHARDAFS FGTLVESLLTI LNEOVSADVLS 
S FQQTbHSTLLNPI PXWRPALCTLLSHDFFRNDFLEWNFLKSL 
TLKSEEE KTE FFKFLLDR VS CLSEELIASRLVPLLLNQLVFAEP 



vav\ksflpyllgpkkdhack5etpcllspa:.fosrv:pvllqlf 

E VH E EH V R M Vb US H J t'A Y VGA I »S bR EQLK KV \ I L \ PQ VLLG \ LR 
D\TSPS1 VAITLHSLAVbVSbbGPEVVVGGE3TK) FKRTAP\SF 
TK\NTDL?LEGDPFSgPIKFPINGLSDVKNTSED£EN7PSSSKK 
SEEWPDWSGPE\EPENQTVNI\QIWP\REP\CDrA'KSOCTTLDV 
EESSWDDCKPSSLDTKVNPGGGITATKPVTSGEQKPI PAbbSL? 
EESHPWKSSLPOKISbVORGDDADOIEPPKVSSORRPLKVPSEL 
GLGEF.FT5 OVKKKPVKDPEMDVJFADMI PEIKPSAAFbl LPELRT 
EM V PK KD U VS P VMQF SS K FAAAE I TEG EAEG WEEEG EbNWEDNN 
W 


5828


2 


2S7 


AREGGSLGAVAACCEbSYSCDFCPARFHTSWl/TRFVKMEFQAVV 
MAVGGGS R MTDLTS S 1 PKPbLPVGNKFbl WY PbNLbERVGFEEV 
I WTTPX VQKALCAE FKMKMKPDI VCI PDDADMG7ADSLRY I Y P 
KLKTDVI.VT.,SCDLITDVALJiE\ r VDLFRAYDASLAf*3L»NRKGOEtS2 
EPVPGOKGKKKAVEORDFIGVDSTGKRbl.FMANEADLDEELVIK 
GSI bOKBPR I RFHTGLVDAHLYCLKKY 1 VUFbMENG\S ITS IRS 
EL\IPYLV/RGKOFS£ASSQOGTRKEKEGGSKGKRGLKSFRISY 
£FY*KEANYTGTGAPY\D\ACWI 


5829 


260 


1255 


PDGRL1 V$ CSEDKT I Kl WDTTNKQCVNNF5DSVGFANFVDFNPS 
GTCI A5AG SDQTVKVWDVRVKKLL0HYQVHSGGVNC1 S FHPSGN 
YL»] TASSDGTLXIbDbbKGRLJYTbQGHTGPVFTVS FSKGGELF 
ASGGAnTOVLLWRTKFDELHCKGLTKRNLKJ?bHFDSPPHLbDiy 
IRTPHPh'EKKVETVEDFFLHLLRLIQSbR*SlCRSbLPLLWISF 
LbIbPO^GKPWG!.COTRVKRPVDIS*TLP*CH0NVCOQPRKRK 
CKT* VTSPVKVX/VSI PLAVTDALEHIMEOLNVLTCTVS I bEQR 
LTLTEOKLKDCbSNQOKLFSAVOOKS 


5830


449G 


3139 


GGKMAAPEERDbTQEQTEKLbQFODbTGlESMDOGRKTLEOHNW 
NJEAAVODKbNEQEGVPSVFNPPPSRPbQVNTADHRIYSYWSR 
POPRGbbGWGYYLIMbPFRFTYYTIbDI FRFAbRFlRPDPRSRV 
TDPVGDIV5FMHSFEEKYGRAHPVFYOGTYS0ALNDAKREbRFL 
bVYU'.GDDHODSDEFCRNTLCAPEVISblNTRMLFVACSTNKPE 
GYRVSQ/vbnENTYPFbAMlMLKDRRE*PV\VGRLEGLl\OPDDL 
INObTFIMrANOTYLVSERLEREERNQTQVLROOODEAybASbR 
ADQEKERKKREERERKRRKKEEVOOQXLAEERRRONbOEEKERK 
LECUFFEPSPLDPESVKIIFKLPNDSRVERRFHFSOSbTVIHDF 
bFSLKES P\ EKFQIEA\NFPRR\VLPCI PSEE\ K?NPPTbO£\A 
GLSHTEVbF VQDbTDE 


5831 




2897


FCSKDKCCLYbPDS3NKSKSCrAKPGAHS0DRHAVMDSERQVKD 
TDD 1 ES PKRS 3 RDSGYI DCWDSERSDSLSP PRKGR DOS FDS LDS 
FGSRSR QTPS PD WbRGS S DG RGS DS ES DL PHR Kb PDVKKDDMS 
ARRTS KG EP KS AVPFNOYbPNKSNQTAYVPAPbRK KKAER EE YR 
KJSWSTATSPAGbGKKAJuODYGPRT\PVS\DDAESTSMFDMRC3E 
EAAVQPHSRARQEQbQL J NNQLREEDDKWQDDbAR WKSRKRSVS 
ODblKKEEERKKMEKLbAGEDGTSERRKSIKTYREIVOEKERRE 
R£UiEAYKNARSOEEAEGILCX?Y3ERFTISEAVbERbEMPKlLE 
RSHSTEPNKSSFbNDPKPMKYbROQSbPPPKFTATVETTlARAS 
VLDTSKSAGSGSPSKTVTPXAVPMbTPKPYSOPKKSODVLKTFK 
VCGKVSVNGETVHREEEKERECPTVAPAHSLTKSOMFEGVARVfl 
GSPLELKCDNGSIEJNIKKPNSVPOELAATTEKTEPNSOEDKND 
GGKS R KGN 1 ELAS S EPQHFTTTVTRCS ?TVAF VE F P £ S PObKND 

uercimnvi/orMrMfir'in/PT vr.covwif Dtfccr'DFTv'rT tpdpt.t^ 
V i?i,fc AU^)\KJ"^tMtnouKvi.ljVlj^VAV VRrlVoFtrbAlLil r ft jjU 

KMPEAKOLHLPNbNSOVDSPSSEKSPVTTPFKFWAWDPEEERRR 
OEKKCK)EQ£RLLQERYQ\KEQDK\bKEE\WEKAQKEVEEEERRY 
YEEEP* 1 1VEDPWPFTVSSSSADOLSTSSSKTEGSGTMNKIDL 
GNCODEKODRRWKKSFOGDDSDLLbKTRESDRLEEKGSbTEGAL 
AMSGN F VS KG V>!E DHQLJDTEAGAPKCG TNPQbAQDP S QNOQTSN 
PTHSSEDVKPKTbPLDKSlNHQIESPSERRKSlSGKKbCSSCGb 
PbGKG AAf5 1 1 ETbNLY FH IQCFRCG \ I CKGOLGDAVSGTDVR 1 R 
NGLLNCNDCYMRSRSAGCPTTL 


5832 




829 


PGRRFRHGSCAFQKQCIMLHICOYFbQGECKFGTSCKRSHDFSN 
S ENbE K bEKbGKSSDbVSRbPTI YRNABDI KNKSSAPSRVPPbF 











VP0GTSERKVDSSC7SVSPNTLS0LEGDC1CLYHIRKSCSF0DKCH 
RVHFHbPYKWQFbttKGKWEDbDNMEblEEAYCNPKlERlbCSES 
ASTFHSHCLJJFNAMTYGATQARRLSTASSV7KPPHF1LTTDWIW 
yWSDEFGSWQEYGROGTVHPVTTVSSSDVEKAYLAY/WyTGV*R 
PGSHbEVPCRKAQbRVRFQSbRSEKPGL'KHN * XGbPQTQI R\AP 
QDVTTMQTCNTKFFGPKSIPDYWDSSALPDPGF0K1TLSSSSEE 
YOKVWNbFNRTLPFYFVOKIERVONLALWEVyOWOKGQMQKONG 
GKAVPER0LFHGTSAIFVDA1C0QNFDKRVCGVHGTSYGKGSYF 
AR DAA YSHH Y S X5DTQTHTMFLARVLVGEFVRG>3AS F VR PPAKE 
GWSNAFYDSCVNSVSDPSIFVIFEKHQVYPEYVIOYTTSSKPSV 
TPSILLALGSLFSSRO 


5833


17C 


3289 


SILCbLSPCWQFGKPWSILSSRSRHSPCTKXGWEGMRKiiLiHT 
RQGHK * VHVE I S KALW VYRDDYF3 RHS I £ VSAVI VRAW I THKYR 
GRDWNVKWEEN1.LHAVAJCNYTLL0TIPPFERPFKDHCVCLEWNM 
G Y 1 WNLRAN R 1 PQC P LE ND WAULG FPYASSG ENTGI VKKFPRF 
RN R E LE ATR RQRMUYPV FT VS L W L Y LUHY CK ANLCG I LY F VDS N 
EMYGTPSVFLTEEGYLHIOMHLVKGEDLAVKTKFIIPLKEWFRL 
D1SFNGG01WTTSIGQDLKSYHNQTISFREDFJIYNDTAGYFII 
GGSR YVAG I EG FTGPLKY YRLRSLHPAQ J FNPLbEKQLAEQI Kb 
YY ERCAEVOE I VS VYASAAKHGGERQEACHDINSYLDLORRYGR 
PSMCRAFPKEKELKDKHPSbFOALLEMDLLTVPRNQNESVSEIG 
GIG FE XAVKR bS S X DGLHQ I S S I VP FbTDS S CCG YH KAS YY LAV 
FYETGLNVPRD0L0GMLYSLVGGOGSERLSSMNl>GYKHYOGIPN < 
YPLDWELSYAY YSNIATKTPbDQHTLQGEOAYVETIRI.KDDEIb 
KVOTKEDGDVFMWbKHEATRGNAAAOQRl^AONLFWGOOGVAKNp 
EAAIEWYAXGALETEDPALlYDYAIVbFKGOGVKKNRRLALELM 
K XAAS KGLKOA VNGLGW Y YHKF KKN YA\KAAK Y WbKA\ EE \ MGN 
PDASYNLGVLHLDGIFPGVPGRNOTLAGEYFHKAAOGGKNEGTL 
WCSbY Y 3 TGNbETFPRDPEKAWWAKHVAEKNGYbGHVJ RKGU4 
AYLEGSWHEALLYYVLiAAETGIEVSQTNLAKJCEERPDLARRYL 
GVNCVWRYYNFSVFOIDAPSFAYLKMGDLYyYGHONOSODLE^S 
VQMYAOAALDGDSOGFFNbALLl EEGTI 1 PXH3 bDFbE 1 DSTI.H 
SNNISIIiQFbYERCWSHSNEESFSPCSLAWLYLHbRbbKGAILH 
SAL1 YFLGTFl.bS I LIAW7VQYFQSVSASDPPPRPSQASPDTAT 
STASPAVTPAADASDOBOPTVTNNPEPRG 


5834 


17 


4020 


RFRRGGGRVFPGAFPASPSDSbGQGNSQGPFRTPKPPRT/QECG 
SAAPGPI PGQSSS * VPbRbEQIQQ KADCPbSLEbAbKPRMAAQV 
TbEDAXSNVDbbEEbPLPDOQPClEPPPSSbLYQPNFTTrNFEDR 
NAFVTG IARYI EQATVHS SMWEMbEEGQEYAVMbYTWRSCSRAI 
PQVKCNKQPNRVEIYEKTVEVbEPEVTKbMNFMYFQRNAIERFC 
GEVRRbCHAERRKPFVSEAYLITLGKFINMFAVbDELKNMKCSV 
KNDHS A Y KRAAQFbR KMADPQS I QESQNLStfFLANHNKI TQSLQ 
QQbRVI SG Y EELLADI VNbCVDYYENR MYLTF S E KHMbbKVMGF 
GbYLMDGSVSN 3 Y KbDAK KR INL»S K 3 DKYFKGbQ VVP LFGDMQ I 
EbARYIKTSAHYEENKSRWTCTSSGSSPOYNlCEOMIOIREDHM 
RFI SELAR YSKS E WTGSGROE AOKTDAEyKKLFDLALOGLCLL 
SOWSAHVMEVYSWKbVHPTDKYSNKDCPDSAEEYERATRYNYTS 
EEKFAbVEVI AMI KGbQVbMGRMESVFNHAl RKTVYAALQDFSQ 
VTLNEPLRQA1 KKKKNVI QS VbQA I RKTVCDWETGHEPFNDPAL 
RGEXDPKSG*D3KVPRRAVGPSSTQLYI<VRTMLESbIADKSGSK 
XTbR S S L EG PT I LD I EK FHR ES FF YTI lb I NPS ETLQQCCDLSQL 
WFREFFbEbTKGRRI QFP I EMSMPWlbTDHJ bETKEASMMEYVL 
YSbDbYNDS AH Y AbTRFN KQFbYDE lEAEVNbCFDQFVY KbAJX) 
IFAYYKVMAGSbLLDKRLRSECKNQGATIHLPPSNRYETbbKOR 
HVQLLGRS I DbNRbl TQR VSAAMYKSLELA3 GRFES EDbTS I VE 
bDGLLE I N RMTH KLbS R Y LTLDGFDAMFR E ANHNY S AP Y GR 3 TL 
HVPWSLNYDFbPNYCYNGSTNRFVRTVLPFSOEFORDKOPNAQP 
OYLHGSKALNLAYSSIYGSYRNFVGPPHFOVlCRLbGYOGIAW 
MEEbbKWKSbbQGTI LOYVKTbME VMPK 1 CRbPRHEYGSPGl L 
EFFHKQLKPI VEYAEbKTVCFQNbREVGNAI bFCbbl EQSbSbE 
EVCDLLHAAPFON I bPRVHVKEGERLDAKMKRbES KYAPbHLVP 








LI ER LGTPO01 AI AREGDLLTKERLCCCCSMFEV ILTR1RS FLD~~ 
DP 2 WRG PL F5 NG VMH VDECVEFHR L W S AMQFVYC I P V3TH E FTV 
EOCFGDGLKWAGCMIIVLLGQQFRFAVLDFCyHbLKVOKKDGKD 
EI1KNVPLKKMVERIRKF0ILNDE3]'2 I LDK YLKSGDGEGTPVE 
HVKCFQPPIKQSLASS 


5835 


420S 


1904 


SGNlRPiAOGSHCIPFQVLHDLRQKFFEVPEVWSRCMLOKNNNl. 
DACCAVLSQFSTKYLYGliGDIjaFSDDSGl.SGLRNHMTSLKXDLO 
£C>N] YHHGRFGSRMMGSRTLTHSISDGQLQGGOSNSELFOQEPO 
TAPAQVPQGFNVFGMSSSSGASNSAPHLGFKLGSKGTSSLSQQT 
PR FN P • M VTI A PN I QTG RNTPTS LH J I : C- V FP P VLNS PQGNS I Y I 
RPYITTPGGTTROTOOHSGWVSQFNrMNPOOVYOPSOPGFWTTC 
PASNPLSHTSSQQPNQCGHQTSHVYKPISSPTTSQPPTIHSSGS 
S0SSA1ISQYNI0NISTGPRKMQ1E1KULPPORNNSSKLRSSGPR 
TSSTSSSVNSOTLNRNCPTVYIAASFPNTDKrMSRSOPKVYISA 
NAATGDEQVMRNQPrLFJSTNSGASAASRNMSGQVSMGPAFIHH 
V:PP KSRAI GNN S ATS PRVWTQPNTN E YT FK I TVSPN KPPAVSP 
G^'SPTFELTNLLNHPDHYVETENIHHLTDPILAHVDRISETRK 
LSMGSDDAA YTC-DI *RI SNS WLGMVA 5 IACNS SALGGQDGR 1 1 * A 
QEFCTSWGNlWRI>RLYRRF*NYAGMVAiiTCSPSYSVD*ALLVHO 
K.^R^ERbORELEIOKKKLDKLKSEV^LllENNLTRRRLKRSNSIS 
01 PSLEEMQCLRSCNRQLQI DIDCLTKL 1 DLFOARGPHFNPSAI 
HNF V DN I G FVG PV P ? KJPKDQRS 1 1 KT ? X TOLH EDDEGAQWNC7A 
CTFLNHPAL1 RCEQCEMPRHF 


5836


363 


■230: 


FMITMCGICCSVNFSAEHPSQDLKEDL.LYNLK0RGPNSSKOI.bK 
SDVNyOCLFS7.XVLHLRGVLTTOPVF.DFRGNVFLWNGElFSGIK 
VcAEENDTQaLFNY^SSCKNESEILSLFSEVOGPWSFIYYQASS 
H YLWFGRDFFGRRS LLWHFSNLGK5 F CL SS VGTOTSGLANOWQE 
V P AS \D FS E L I LS LuS FP D ALFYNC 3 LG N I FLGR 1 LLXKML 1 A* 
VXFQOTYOH LYQR * 0MKPNC1 LKNLLF L » I * CCHKLHNRL2 AV I 
FPMCKLQERYFKSFLLMYT'KEVICOr.; OVLSVAVKKRVLCLPR 
DENLTANEVLKTCDRKANVAILFSGG1 ESMVI ATLADRHI PLDE 
P I DLLNVAFI A E EKTMPTTFNR EGNKO KJ>) KCE I PSEE FS KDVAA 
AAADSPNKH VSVPDR ITGRAGLKRLCA VSPSR I WNFVEINVSME 
ELOKLK RTR I CHLI R PLDTVLDDS IGCA VWFASRG2 G WLVAQEG 
V KS YQSNAK WLTG I GADEQI AG YSR1 : . R VRFOSHGLEGLN KE 1 M 
MELGRISSRNLGRDDRVIGDHGKEARFFFLDENWSFLNSLP1W 
EKANLTLPRG2GEKLLLRLAAV2LGLTASALLPKRAMOFGSRIA 
KMEKJNEKASDKCGRLOI MSLENLS1FKETKL 




47 9i 


901 


NGNAVAQAPVTNCCYLATGSKJDQTIR1 ViSCSRGRGVMILKLPFL 
KRRGGG 2 DPTVKERLWLTLKWPSNQPTQLVSS CFGGELLQWDLT 
OSWRRKYTLFSASSEGQNHSRIVFNLCPLOTEDDKQLLLSTSMD 
RDVKCWDI ATLECSWTLPSLGGFAYSLA FS SVDIGSLA1 GVGDG 
MIRVWNTLS I KNNY D VKN FWQGVKSKV T ALCWH PTKEG CLAFGT 
DDGKVGLYDTYSNKPPQISSTYHKKTVTTliAWGPPVPPMSLGGB 
GDS PSLALY SCGGEG I VLQHN PW KLSG L A FD I MKbl R DTNS I KY 
KLPVHTE3SWKADGKIMALGNEDGS1E1 FQ\1 PNLKLICTIQQH 
HKLVNTISWHHE\HGSPAQKLSYL\MFSGS(?C>CSPFTCHNLKNC 
P*KAAPBSPSDPLQSPYRTPPQGHTAQDYPVWAWEPHIH*WEGL 
VFCFFIDGYSPGCWD\AFPGKEAPVAJFRG\HOGRLLCVAWSPL 
DPDCI YSG\ ADDFCVHyOJLTSMQDHSR P PQGKKSI ELEKKRLSQ 
PKAKPKKKKKPTLRTPVKLESIDGNEFE5MKENSGPVENGVSDQ 
EGEEOAREPELPCGLAPAVSREPV3CTPVSSGFEKSKVTINNKV 
I LLK KEP PKE K PETL I KKRKARSLLPLS TS LDKRSKEELHQDCL 
VIJ^TAO<HSRELNEDVSADVEERFHLGLFTDRATLYRMIDIEGKG 
HLENGHPELFHQLWLWKGDLKGVLQTAAERGELTDNLVAr4APAA 
G YHVWLKAVEAFAKQLCFODQYVKAAS h LLS IHKVYEAVELLKS 
NH FYREAI AI A KARLRPEDPVLKPLYLi WGTVLERDGH YAVAAK 
CYU5ATCAYDAAKVLAKKGDAASLRTAAELAAI VGEDELSASLA 
LRCAQELLI^\WVGAQ5;AU)LHES1X)C-URLVFCXLELLSRHXE 
EKOLSEGKSSSSYHTWNTGTEGPFVERVTAVWKSIFSLDTPEQY 
OEAFOKLQNI KYPSATNNTFAXQLLLH3 CKDLTLAVLSQQMASW 








DEAVQALbRAV^SYbSGSFTiMOEVySAFLPDGCDHLRbKiiGD 
HOSPATPAFKSLEAFFLYGRbYEFWWSLSRPCPNSSVWVRAGnR 
TbSVEPSQO^TASTEETDPETSO/PEPNRPSEbDbRbTEEGERM 
bSTFKELFSEKHASLOU SORTV AEVOETLAEMI ROHQKSObCKS 
TANGPDKNBPEVEAEOPbCSSOSOCKEEKNEPbSbPEbTKRbTE 
ANQRMAKFPES1 KAWPFPDVbECCbVbbbl RSHFPGCbAOSMCO 
OAOELLQKYGNTKTYRRHCOTFCK 


5838 

1 


110 


96 


KTMPHLbVTFRDVAIDFSOEEWECLDPACRDbYRDVMbENYSNb 
]£LDLESSCVTKKLSFEKEIYEMES\PSGR1WGNVST3TFQYNG 
UG DNMECKGNbEGOVS KS EGbYMCVK I TCEEKATESHSTSSTFH 
RI 3 /HYQGK3 VKCKECRQGFSYbSCblOHEEMKNl* KCSEVNKH 
RNTFSKKPSYI*HQ\KFRbGEKPYECMECGKAFGRTSDblOHQK 
IHTNEKPYQCNACGKAFIRGSOLTF.HQRVHTGEKPYCCKKCGKA 
F£ YCSQYTLHQR I HSGEKPYECKDCGKAFIbCSCLTYKQR I HSG 
EKPYECKECGKAr IbGSHbTYHORVHTGEKPYICKECGKAFI.CA 
SQLNEHQRIHTGEKPYECKECGKTFFRGSQbTYHLRVHSGERPY 
KCKECGKAFISNSNb30H0RIHTGEKPYKCKECGKAFICGK0bS 
EHQRIHTGEKPFECKECGKAFIRVAYbTCHEKlHGEKHYECKEC 
GKTFVRATQLTYH0R1 HTGEKPYKCKECDKAF/HLWLT1 LSEHQ 
RI KRGF.KP YECKOCGR / LF 1 RGSHL/N EHbRTHTGEKP YECKEC 
GRAFSRGSEHTLHORIH'i'GSKFYTCVQCnKDFRCPSObTQHTRL 
HN * F/T S SH K I CM H S I AbAS LDF AH bQE KN P EN 


5839 


1 


2421 


GRPFPRPPRAbPRbPLRGRROEGRWTVDFEECLKD\SPRFRAM- 1 

EEVEGDVAELEbKbX DKbVKbC 3 A\M I DTGKAFCVANKQFMNG I 

RD \ bAQNS \ NND A \ WET K FA P S FbDS bQE M 3 N FH T I b/ b * PNS 

EI K* GHS FQNFV K EDbRKFKDAKKQFENSQ* KRXX I ALVKN AP V 

PSRPASbEL*KPPNIbTATRKCFRH3AbDYVb03NVbQSKRRSE 

IbKSMLSFMYAKbAFFHOGYDbFSEbGPYMKDbGAQliDRbVGDA 

AK EKR EMEC KHST 1 QO KD FSR DDS KbK YNVDAANG I VMEGYbFK 

RASNAFKTWKRRWFS I QKTOOWYQKKFKDNPTVVVEDbRbCT VK 

KCEDI ERRFCFEWSPTKSCMbOADSEKI^ROAWI KAVQTSI \AT 

AYREKDDESEKbDKKSSPSTGSbDSGNESKEKbbKGESALORVQ 

CI PGNASCCDCGbADFRV?AS3NL»G3TbCIECSGIHRSbGVHFSK 

VRSbTbDTWEPEbbKbMCEbGNDVlNRVYEANVEKMGIKKPOPG 

0RQEKEAYIRAKYVERKFVDKIFb*SbSPP\E0OXK\FVSKSSE 

EfCRLSISKFGP\GDQVRASAOSSVRSNDSGIQOSSDDGRBSbPS 

TVSANSbYEPEGERODSSMFbDSKKbNPGbQLYRASYEKKbPKM 

AEAbAHGADVNVJANSEENKATPLIQAVLGGSbVTCEFbbQNGAN 

V>JQRDVQGRGPbHHATVbGHTGOVCbFLKRGAJIQHATDEBGKDP 

LS I AVEAANAD 3 VTbbRbAR M»iEEMR ESEGbYGQ PGDETYQD I F 

RDFSQMASNNPEKbNRFQQDSQXF 


5840 


698 


3610 


KHIJCLPROHLTTbWQ 3 £ S PR WRS PORAFMSAbSKTOTOSAPAbQ 
GbSS bbQS VTGNP VPAS EAASOSTS ASPANTTVY T I KGRK b PSS 
AQPFIPKSFNYSPNSSTSEVSSTSASKASIGQSPGbPSTAFKbP 
SNTKG FTATHKTS P AAFPTE VTI CQSSE VSKPKb\ ES ESTS PS b 
\SMK3HNFLKGNPGFSVA*NbKHPNPAGSLGSSAPSESHPSDFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSbb 
SKIlSPGSSTPSSTRSPPPGRDESYPREbSNSVSTYRPFGIjGSE 
S PY KQ PS DGME R P SS bM DS S QE KF Y PDTS FQEDEDY R DFE YS G P 
PPSAMMNbQKKPAKSIbKSSKbSDTTEYOPILSSYSHRAOEFGV 

SPSKNDSFFTPDSNHNSbSOSTTGHbSLPQKQYPDSPHPVPHRS 
LFSPONTLAAPTGHPPTSGVEKVbASTI STTSTI EFKNMXXUAS 
RKPSDDKHFGQAPSKGTPSDGVSbSNbTQPSbTATDQOXXJBEHY 
R I ETR VS S S CbDb PDS TE E KG A P I ETbG YHS ASNR RMS GEP I QT 
VES I R V PG KGNRG HGR EA£ R VG WFDLS TSGS S FDHG PS S AS E LA 
SbGGGGSGGbTGFKTAPYKERAPOFQESVGSFRSWSFNSTFEHH 
LPPSPbEHGTPFOREPVGPSSAPPVPPKDHGGIFSRDAPTHI.PS 
VDI^NPFTX^AlJUiAAPPPPPGEHSGIPFPTFPPPPPPGEHSS 
SGGSG VP FSTPPP PP PPVDHSG WPFPAPPbAEHGVAGAVAVF P 
KDHSSblWTl^EHFGVbPGPRDHGGPTQRDbNGPGl^SRVRESI, 








TLPSHSLEHLGP^HGGGGGGGSNSSSGPPLGPSHRDTISRSGII 
LRSPRFDFRPRE?FLSRDPFHSLKRPRPPFARGPPFFAPKRPFF 
PPRY 


5841


1908 


761 


GLRLFLVLTVWPKKKPSWLSRTEFSKRLIjCRTIjWCOSGWSSRSY 
TRSMLKWTTS INEKSRTS TKS TRTSAR PGLTATVS 1 GLS DS PTW 
RHCWMTARSCSGlKGGHWAPRQVGVYLLPGRVGCVSSRVSPSFP 

gdg:,dsglarrg£-avsalasglveepmw;ppfhptprfkavsak 
s k fdl v£ ogfte ft i edfhnt fmdli eqvexqts vadllas fnd 
ostsdylwylrlltsgyliqreskffehfibggrtvkefcqnqe 

\VEPMCKESDHI HI 3 AlAQGLQR\mPGViEYMGPRPRAATTNPH3 
FP*CLP£PKVYLLYRPG\HYD1LYKIGLGSSPLGCPGCPLLARA 
I,GHCYRGFSVWKvsSYFTPFFLSHD?PPMFY 


5842 


307 


191* 


QEPTADFKLRSTCGCGREMTCPDKPGQLINWFICSLCVPRVRKL 
WS S R R PRTRRNI.1 .IjGTACA I YLG FLVSQ VGRAS LQHGOAA E KGP 

hrsrdtaepsfpeipldgtlappesogngstlqpmwyitlrsk 
rskpanirgtvkpkrrkkkavasaapgoealvgpslopqea\eg 
klml * hlgtlrectwlrlesdpggmcgvrb/wraggfdflopss 
resniriysesarswlskddirrmrlladsavaglrpvssrsga 
RIjLVLEGGAPGAVLRCGPS pcgllkqpldmsevfafhldr 3 LGL 
NRTLPSVSRKAEFlODGRPCPiaiiWDASLSSASNDTHESVKLTW 
GTY QQLLKOKCWCNGRY P KPESGCTE 3 HKHEWSKMALPD FLLQ1 
YNR bDTNCCGFR ? J . KE D AC VQNGLR P KCDDQGS AALAH 1 IORKH 
DPRHLV PI ONKG FFCRSEDNLNFKLLEGI KEFPASAVYVLKSQH 
LRQKL1»0SLFLDKGYKESQGGR0G1EKLIDV3EHRAKH,ITYIN ' 
AHGVKVLPMNE 


5843 


500 


1453 


GTARL.VTCW VLHGQ* VXXPAWEPGWWL* Q * RCR P KG WG LG AGM 
RSSRMS0PP0CLKRACSSCCHFMVKLLDDGTPM1PGEKVAHTSL 
DALVTFHOOKPIEPRRELLTOPCROKDPANVDYEDLFIiYSNAVA 
EEAACPVSAPEEASPKPVLCHOSKERKPSAEM/RQNNHOGSHFL 
LPPK1 PSWRDPPF.TLEE PON APRERPEGPAAAKKPPRHCBLWT 
LGCPEIHGDLRPWBRKROPRSLRGSHLGGORI^HGSLCGHISOKP 
LTAPGTKR0KGPHQ2GREVGQLH*GDPRGQEIAPNGSESP1LPG 
VQARAPGLGRA 


5844 


202 


2471


FDSAVLS S 1 NVMA VLPG PLQLLGVLI/TI SLSS I R hi QAGAY YG1 
KPLPPQI PPQMPPOI PCYOPLGOQVPKMPLAKDGLAMGKEMPHZ, 
OYGXEYPHLPOYWKE3 Q PA PR MG KE AVPK KGKE 3 PLASLRGEOG 
PRGEPGPFGPFGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PGKPGAMGMPGAKGE3 G0XGEIGPMG2P* PQGPPGPHGLPG3GK 
PGGPG3>PGQPGPKCDRGPKGLPGPOGLRGPKGDKOFG«PGAPGV 
XGPPGMHGPPGPVGbPGVGKPGVTGFPGP\QGPt<GK\PGAPGEP 
GPC;GP 3 GVPGVOGF PG 3 PG3G KPGQDG\ I PGQPGFPGGKGEOG1* 
PGLPG P PGbPG 3 GK PG F PGP KGDRGMGGV PGA1X3 PRGEKGP I GA 
PGI GG P PGEPGLPG I PGPKGP PG AIGFPGPKGBGG1 VG PQG PPG 
PKGEPGbOGFPGKPGFLGEVGPPGMRGFPGPlGPKGEHGQKGVP 
GLPGVPGLLGPKGEPG I PGDQGLCGPPGI PG3GGPSGP1GPPGI 
PGPKGEPGLPGPPCFPG2GKPGVAGLHGPPGKPGALGPOGOPGL 
VGPPGPPGP PGPPA VMPPTPPPOGEYI>PDWGLGI DG VXPPHAYG 
AKXG KNGG PAYEMPAFTAELTAPFPPVGAPVXFNKLLYNGRONY 
NPQTG I FTCEVPGVY Y FAYHVHCKGGNVWVALFKNNEPVM YTYD 
EYXXGFLDQASGSAVLLLRPGDRVFLC^PSEQAAGLYAGOYVHS 
SFSGYLLYPM 


5845


215 


2061 


HASNKSASLQGXKAK P KEKTAMCLVNELAR FNRVQPQY XLLNER 
GPAHS KM FSVQLS 1 GEQTWESEGSS I KKAQQAVGNXAI/T ESTLP 
KPI * KPPKSNVNNNPGCITPTVEIjMGIAMKRG\KPA3HRPt>DPK 
PFPNNRANYNFQVKYNQRYHCP 3 PKI FYVQLTVGNNEFFG5GKT 
RQAARh^AAMKAIjCALONEPIPERSPQNGESGKDMDDDXDANKS 
E I S L VFE 3 A1»KR Wi P V S FEV2 KESG P PHM K S FVTRVS VGEFS AE 
GEGNSKKLSKKRAA.TTV1>QELKKLPPLPWEKPX\HFFKKRPKT 
I VKAGPE YG0GMNF I SRIAOIOQAKKEKEPDYVLI-SERGMPRRR 
EFVi^VKVGN3?VATGTGPNK3<3AKKNAAI2AMLLQLGYXASrNW? 
DQLEKTGENKGWSG PKPG FPEPTNNTPKG3 LHLSPDVYQEMEAS 








RHKVISGTTLGYLSPKDJWQPSSSFFS1SPT3NSSATJARELLM 
NGTSSTAEAIGLKGSSPTPPCSPVQPIT KQJ>EYI AR1QGFQVKYC 
DROSGKECVTCLTLiAPVQMTFHAIGS c IEASHDQV* YATAILLC 
Y G? AR K U KA I KMEAMCAHAALLSL I HYLLAPSARLBKSKLFALG 
N 




1126 


456 


TS KLJ KKTF 1 1 Gl SGVTNSGKT7LAKNLQKHLPNCS VI SODDFF 
K P ES E 1 ETD XNG FLQY D VLEALNMSKMN S A I S CWKES AR HS WS 
TDO AEE 1 P I LI 3 EGFLLFNYKPLDT1 WNRS YFLT1PYZECKK 
RRSTRVYQPPDSFGYFDGHVWPMYLKYROEMQD1TWEWYLCGT 
KSEEDLFLOVYEDLIOELAKOKCLOVTA*RRNTTNPS/CK*IRK 


5847 


2769 


SOS 


AP FMEDLS S PDSTLLQGGHNLLSSAS FOES VTFKDVI VDFTOEE 
WK0LDPG0RDLFRDVTLENYTKLVSIGLOVSKPDV1SOLE0GTE 
PW 1 MEPS 1 PVGTCADWETRLENSVSAPEPD3 SEEELSPEVI VEK 
HKRDDSWSSNLLESWEYEGSLERQQANOQTLPKE1KVTEKTIFS 
WEKGPVNNEFGKSVNVSSNLVTQEPSPEETSTKRS1KQNSNPVK 
KEKSCKCNECGXAFSYCSALIRHORTH'TGEKPYKCN*//CVEKAF 
J: K S ENLI NHOR I HTGDKPY KCDQCGKGF1 EGPSLTQHOR IHTGE 
KPYK CDECG KA FS0RTHLVQHQR1 HTG EK P YTCNECGKA FSORG 
HFKEHOK1HTGEKPFKCDECDKTFTRSTHLTQK0K1HTGEKTYK 
CNECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSOHSNLTO 
HQKTHTGEKPYDCAECGKSFSYWSSLAOHLK3HTGEKPYKCNEC 
GKAFSYCSSLTOHRRlHTREKPFECSECGKAFSYLSNbNOHOKT 
HTQEKAYECKECGKAFI RSSSLAKHER I HTGEKPYQCHECGKTF 
SYGSSLIOKRKIHTGERPYKCNECGRAFNONIHLTOHKR1HTGA 
KPYECA^CGKAFRHCSSLAQHQKTHTEEKPYQCNKCEKTFSQSS 
HLTQHQR 1 HTGE KP YKCNECDKAFSRSTHLTQHQR I HTGEKPYK 
CN E CG K \ TFSQ S TYLI QHQR I HSGEK PFGCNDCG KS FRYR S A^N 
KHQRbHPGI 


5848


22 


296: 


AAPRRLLRGGDGDRTPRFPLPALLRPGPFAEAAPERRKMPAVSK 
GDGMRGLAVFISDIRNCKSKEAEIKR1KKELAN1RSKFKGDKAL 
DGYSKKKYVCKLLFIFLLGHDIDFGHMEAVNLLSSNRYTEKQIG 
YLF3SVLVNSKSEMRL1NNAIKNDLASRNPTFMGLALHCIASV 
GSREMAEAFAGEIPKVLVAGDTMDSVKQSAALCLLRLYRTSPDL 
VPMGDVIT S RWHLUIDOHLGWTAATSL 1 TTLAQ KNPSEFKTSV 
SLAVSRLS \RI VTSAETDLQDYTY * FCPG F LGLSVKLLRLLOCY 
PPPDPAVRGRLTECLETILNKAQEPPKSKKVQHSNAKNAVLFEA 
ISUIlHHDSEPNLLVl^CNQIjGQFLOHRETNLRYLALESKCTLA 
SSEFSHEAVKTHIETVINALKTERDVSVRQRAVDbLYAMCDRSN 
h?Ol VAEMLS YLETADYSIREEI VLKVA I LAEKYAVDYTW\Y\T) 
T 1 L»NL I R 1 AGDYV S EEVWYR VIQ I V I N RDD VQGY AAKTV FEALQ 
APACHENLVKVGGYILGEFGNLIAGDPRSSPLIOFHLLHSKFHL 
CS VPTRALLLSTYI KFVNLFPEVKPTI QDVLRSDSQLRNADVEL 
G^RAVEYLRLSTVASTDIIATVLESMPPFPERESSILAKLKKKK 
GPSTVTDLEDTKRDRSVTJVNGGPEPAPAS7SAVSTPSPSADLLG 
LG AAP PA PAG P P PS SGGSGLLVDVFS DS A S WAPLA PGS EDNFA 
RFVCKNNGVLFENOLLQIGLKSEFRQNLGRMFIFYGNKTSTOFL 
NFTPTLI CSDDLQPNLNI QTKPVDPTVEGGAQVOOVVNI ECVSD 
FTE A P VTjN I QFRYGGTFQNVS VQLP I TLN K FFQ PTE MASQDFFQ 
RWKOLSN PQQEVQN 1 FKAKHPMPTEVTKAK I IG FGSALLESVDP 

|MF>4»Nr VvAu IJ.H1AJ J yiot-ljJjnlJt.r'WI.'V^v'^* KU1 it'll J>AXJnv 

SQRLCELLSAQF 


5849 


3545 


1895 


KRR E 1 KET V FHH VAO AGLE LpLS S SN P PSS AS RS AG I TGMRHQVQ 
P*D?CMSLSPPCFTEEDRFSLEAL0TIHKQMDDDKDGG1EVEES 
DEF I REDMKYKDATNKHSHLHRBDKHI T J EDLWKRWKTS EVHNW 
TLE DTLQWM E F VELPQY EKNFRDNNVKGTTLPR I AVKEp S FM1 
S0LKISDRSHROKLQbKAlJ>VVLFGPLTT?PPHNWMKDFlIjTVSI 
VI G VGGCW FAYTQNKTS KEHVAKMMKDLES LQTAEQSLMDLQER 
L»B KAQE ENRNV AVEKQNL* R KMMDEI N YAK EEACRL»R ELREGAE 
CELSRROYAEQELEOVRMALKXAEKEFELRSSWSVPDALOKWLQ 
LTHEVEVQYYN J KRONAEMObAIAKDEAEKl KKXRSTA/FGTLHV 








AHSSSLDEVDHK3LEAKKALSELTTCLRERL?RWQQiEK3CGFQ 
3AHNSGLPSLTSSLYSDHSWVVMPRVS3PPYP3AGGVDDLDBDT 
PPIVSOFPGTMAKPPGSLARSSSLCRSRRSIVPSSPOP0RAOLA 
PHAPHPSHPRHPHHPOKTPHSLPSPDPDlLSVSSCPAliYRNEEE 
EEAIYFSAEK0WEVPPTASECDSLNSSJGRKQSPP/SKPRD3PN 
1 I S/DERYOEMRCP+ R3 PSGGIL 


5850


3 


1095 


KAVLN F S AS V I S LTGSNPMHDASM^ LKKNG I J VY LDVPLLN 
liiCRLKLMKTDRlVGONSGTSMKDLl.KFRROYYXKMYDARVFCE 
SGASPEEVADKVLNA1 KRYODVDSETFI STRHVWPEDCEQKVSA 
EFFIEAV1EGLASDCGLFVPAXEFPKLSCGEWKSLVGATYVERA 
01 LLERC3 UPADI PAARLGEM3 ETAYCENFACSK1 APVRHLSGN 
QFILELFHGPTGSFKDLSLQLMrHIFAOCIPPSCNYMILVATSG 
DTGSAVLNGFSKLNKNDKORIAWAFFPENGVSDFOKAQIIGSQ 
RENGW AVGVES DFDFCQT AI KR 1 FK DS D FTGFbTV E YGT I LSS A 
NSINWGRLLPOVVYHASAYLDLVSOGFI SFGSPVDVCI PTGNFG 
K 3 LAAVYAK1W5G 3P3RKFI CASNQN HVWTDF 1 KTG \H YDL.RGKE 
N*AOTFFfV0* I FLPNLSNLERHLHLMANKDGQLMTELFNRLES 
0HH FQ I EKALVE KJUQODFVADWCSEGECI.AAINSTYNTSG Y ILD 
PHTAVAXWADRVODKTCPVIISSTAHYSKFAPAIMQALKIKEI 
NETS S SCLY LLG S YNALPPLHFlALLERTKOQEXKjEYOVCAADI^N 
VLKSHVEQLVONQFI 


5851


3120 


1802 


rcy lioflallltstsaraaaai aaaeep ags ps vmtragdhnrc 
rgccgsladylitsakfllylghslstwgdrmwhfavsvflvely 
gks llltav yg l w ags v lvlga3 i gd w vdknarlkvaqts lw 
onvsvi lcgi i lmmvflhkhelltmyhgwvltscy1l1itian3 
anlastata 1 t 1 qrdw i wvagedrs klanmnat1 rr i dqltn 3 
lapmavgq3 mtfgspv3gcgfisghnlvskcveyvllwkvy0kt 
palavkaglkeeetelkolnlhkdtep:<plegthlmgvkdsnih 
elehe0eptcas0maepfrtfrdgwvsyyn0pvf/lgkhgscfp 
lydcpgl* lhhh r vrlhsgtewfhpo y fdgs 3 s ykwnngncs fy 
latskmwfgsdr sdlr3 gtaflfdlvcplci hawk ppglvrfsf 


5852


1 


422 


KTrFPSSLCPI,RQLPEVRGYSGQPLTDPL3SLCRSHKCRGKGWG 
SSSYPSLPALLRARSAFGHCTHRSCGPEWRIDSISRLEKOGA^R 
S GWAQAQ PT I LL LV PRLR KS LPS 3 WG / S LMG F F 3 TSGPG/ W FRO 
YYFF3SGRH*VLFTECDFYY/VAMDFGGHGL9SKYSPGVPYYLQT 
FVSE3RRWAGKK0SVYFRRCGGCSRAPPL3TGGGVGSRKQRWP 
ESGAWALAPGLPA IHGRSWES 


5853


223 


1346 


R LI X5 LS R V K G LH G P AAS A W 3 SDPETK G D ? GG PWG M WRGS DLR PR 
PVSLTGLTLVCK * AAOGPQV\HSVKLCFGLGG\PCLL\FP3 FRP 
LLLK PRR PRLKPGTRGVAVEPHALR WH VAHGEEAG 1 RAAGPGH 
GGVE3P0G/VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 
HRRR FP PD PALTCPGLGOD0GPRE0OKOGSGRHDT 3 LGDWGESE 
SRWVRGNFRTGTAATLIGFSRNPTLNGS ENWGSLVS 1 0EEGPDT 
GWBREXRNPAEMGNPORWASPIHrPPLGPEILRANPEALRAMPE 
ALGLRPDPATS V P SALS / QTF/ PES W PR S CLRNQGETLGMG PVP 
LSSLC3 TES PSQN WTPCLLLLTCPRGLF 


5854 


86 


938 


KGRNTAP EX KGAALNNRENASS * NGY/S R WKQD1 R R I ENH 3 3 QE 

LXHLCAM 3 XR VLL ERLENTRKLRELTEG RTLDWP0NR 1 TEVSAK 

R0I VTEYREKGKRN* EEKKRDLBGR5KRY7^3>CI3G3 PETEDRAS 

GAETIKULLE/fcNr PE^KNEI^MjW&isAnKlFbKrWfcJv^ 

3 RVTFL/ KFQRRN 3 LQASSQRKOVTY KGAKVR LTS DFSPA3 LNA 

RRQW/N/PISRVLRENNFEPR3IYSAKLSFLYKGNWKTFLD3QG 

LGKYINOELSLKILLKDLLQLTENLN 


5855


536 


2391 


LRSYGCKAPSRiSKLHKVFLFLLLPSLLMGYSESFPPITDSWAP 
F3SLTHHVLSQSCSPLSSNCW3 CLSTHT0* FTALPADLLTWTQS 
NVSLH3 SYLAI PFLADSFLKPV/L* PGNSAKHLSFKLSSLSMVS 
GRAVALLEL3ASGLTS3QTNTASSKPP3WGY\LSTOTSF3SPPP 
LCLSRTYPN?AH/\TMVG0VP0S LCGLI FTL/RTPCK PS 3 LHPNY 
K3 ISTSAWQKVLCFSGSPrXHTSLHLTTGSSFLSFKPl PGFPAA 
NSAL YVS SL KG P FG KNVT1 PS PVTGT * O P PH RGSN/ R LTVDKJDN 




} rL5PKFNSl,nCLPS0\TPY0ALTGAALAGSYPlWEN£NTLSWL " 
FTtt'YKFCL? yPSl»FFLCOTN*Y*XLPANWSGTCTLVFQAPTIN 
I : ,? PNQTIL I S VEAS I S SSP IRNKWALHLI TLLTGLG 1 TAALGT 
G 1 AG I ITS 1 7£ YGTLFT7LSNTVEDMKTSITSLQRQLDFLVGVJ 
LCNWRVLDUrrEKGGTClYLOEECCFCVNESGlVHlAVRRLHD 
KAAEL* HCVA^SWWQG£Stil.RWlPWVAPFLGPLI FLFLLLMIGP 
CI fTMLVSRFl SGRLNCFIOASMQKJ!IDNIFHLC}fV* YQSLRGNH 
SFAPEFRP 


5856


17? 


1137 


PW'LHGLGLSAVFLFYL* /YVTFHLYGG1 ILLLLIFIS 1 AG1LYK 
FODVLLYFPF.^PSSSRLYVPMPTGIPHENIFIRTKDGtRLNLIb 
I ft YTGDNS PYS PTI 1 YFHG^GNIGHRLPNAiLKLWLKVNLLL 
\T)YRGYGKSEGEA£EEGLYLDSEAVLDYVMTSPDLDKTKIYLSG 
RSIGNGAAAj H1jASDNS>^RISA1MVENTF1jS1PHMASTL?SFFP 
MR YIiPLWCYKN KFLSYR K I SQCRM PS LFI SGLSDCL I P PVMKKQ 
L YELSPSRTKR LAI FPDGTHNDTW0C0GYFTALE0F3 KEWKSH 
SPEEKAKTSSNVT11 


5857 


155'/ 


563


KL ) GKVLVbSVVAXAMAAFAVBPOGFALGSEPMMbGSPTSPKPG 
VNAQFLPGFL.^GDLPAPVTPQPRSISGPSVGVT-IEMRSPliLAGGS 
FP0PVVPAHKDKSGAPPVRS1YDDISSPGLGSTPLTSRROPN1S 
VNQS PLVGVTS T PGTGOSM FSPAS I GOPRKTTLS PAQLDP FYTQ 
GDf bTS EDH \ 1 DDSWGDC1 WGFLKAS A\S Y 1 blAQFAQYGGIS * 

DK^VMESSDRCALSSPSLAFTPPIKTLGTPTOPGSTFRISTMRP 
U-.TAYKASTSDY0VISUROTPKKDESLVSK7^)EYMFGW 


5858


3Sb 


1419 


rjKOPAAASTcxViQQQQPPPPPQDSSKPWAQGPGPAPGVGSAP 
FA£SSAPPATPFTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 
P5SGVPTTPPQAGGPPPPPAAVPGPGPGPKOGPGPGGPKGGKMP 
GGPKP GGGPGL S TPSGH PKP PHRGGGE PRGGRQHH P PYHQQHHQ 
GFl'PGGPGGRSLEKISGPRRGFKANLJil.ljRRPGEKTYTQRCRFC 
LLG 1 YLL»I SRVnNSRRLFAK I WENQEKFLSTKAKDSEF1 KLESR 
ALA*NC?KFELG *YTP*GGRQLPS£LFPTHACLPL>S CSV! FSPF 
MF PQ* NCWGKKP FRPNLGPHLKGAVCHRWDDPWEGPTGKGHCLN 
FAS 


5859 


307


1503 


GG5S;^PRA£5KRMLSRKKTK3JEVSKPA£VOGKYVKKKTSPLLR 
NLISPS FI RHGPTZ PRRTD3. CLPDSSPNAFSTSGDGW5RNQSFL 
RTI lORTPhXINRRSSNRLSAPSYLARSLADVPREYGSSQSFVT 
EVS FAVENGDF( : S R Y Y YSDNFFDGQR KR PLGDRAH EDYRY YEYN 
HDLFO^PONOGRIUVSGI GR VAATSLGNLTNHGS EDLPLPPGWS 
VEKTMRGRKYY : DHNTNTTHWSHPLEREGLPPGMERVESSEFGT 
YY VDHTNKKACY\RHPCAPTCTS V* STCSCH I / AS / RQQTERJJQ 
SL LVPAl>IPy HTAJE 1 PDWLQVYARAPVKYDH I LKWELFQLADLDT 
YOcMLKLLFMKFLEQIVKNYEAYROALLTELENRKOROOWYAOO 
HGXNF 


5860 


2956 


1270 


T1RVEEF PLCPC- GG KAQLSS ASLU5AGI>LLQPPTP P PL.LLLLFP 
LIjI.FSR LCGALA.G P 1 1 VEPHVTAVWGKNVSLKCL 1 E VKETI TQ I 
SWEKI HGKSSQT VAVHHP0YG PSVQGEYQGRVLFKNY£I,NDA?I 
TLHNI GFSDSGKY I CKAVTFPLGNAQSSTTVTVLVEPTVSIilKG 
PD£LIDGGNETVAAZCIAATGKPVAHIDWEGDLGEMESTTTSFP 
NE7AT1IS0YK1FPTRFARGRRITCWKHPALEKDIRYSFILDT 

qyapevsvtgyegnwfvgrkgvnlkcnadanpppfksvv7srldg 
ok fdgllasdn7 lkfvhpltfn y sgvy i ckvt \ns pg s kevtqk 
vhptfqdpslptypplpal0fqwaspsta*tssd\latep*k1a 
pspi^tlXatikgktolptiia^csgvgalfivNlvkcfglgip 
cyrprrtfrgdyfaknyipfsdmqkesoidvlqodeldpypdsv 
kke^kn pvnnli k kdyleepe ktownnvenlnrferpmdy yedl 
kmgmkfvsdehydeneddlvshvdgsvisrrewyv 


5861 


205: 


1305


EVCACVQAFWLVAESGDDSOGGDKCGCEVGSWGSMRWMARWi 
SEGE0GIPTACAAFAOQPAG/EPRRGLAGVGEGGPCCSWVKYRC 
TLH FLVSLLGTDLARGRGNSASGPTAPADSKQL/ML* DVHRRVI 
LE * RMNSGS PA* DNAPSQR FCTN1»S BGLRFGI S PSWREALYGCH 








A 


5862


ISM 


483 


FP FOb 3MGE3 K VS PDYNW FKGTVPLK K 1 1 VDDDDS Kl WS LY DAG 
PR S I RCPL I FLF P VSGTADV FFRQI LALTGWG YRVI ALOYP VYW 
CKLEFCDG FR KLLDKLQLD KVHLFGAS LGG FLAOKFAE YTH KSP 
RVHSLI LCN S FS DTE 3 FNOTKTANS FW LMP AF MLXX3 VLGNFSS 
GPVDPMMADAJDFMVDRLESI^QSELASRLTLNCQNSYVEPHKI 
RD3PVTIMDVFD0SA1STEAKEEMYKLYPNAPRAHLKTGGNFPY 
LCRSAEVNLY\^QIHL/R/RN<;mEPN7R?LTH0V?SVPRSLRCRKA 
ALASARRSSSVSLAVNDFXTRC^LV'SVASAPVSRPFPSGSSGS 
PVLTVSGK 


5864
5865




24S 


PFPSRGSbFLAAPREDTMGPLMVLFGLLFLyPGLADSAPiJCPQN 
VNlSGGTFTLSHGWAPGSLl/rYSCPCMSLYPSPASRLCKSSGOWQ 
TPGA7RSLSKAVCKPVRCPAPVSFENGIYTPRLGSYPVGGNVSF 
ECEDGF I \LRGS PVRQCR PNG KWDGETAVCPNGAGHCPN PG I SL 
GPWRTGFRFGHGDKVKYRCSSNLVLTGSSERECOGNGVWSGTE 
PI CROP YSYDFPEDVAPALGTS f shmlgatxpjqktkeslgrkx 
QIQRSGHLNLYLLLDCSQSVSENDFL1 FKESASLMVDR1 FSF5I 
NVSVAI lTFASEPKVLMSVLNDNSRDMTEVlSSLENANYKDHSN 
GTGTNTYAALNSVYLMMNNQMRLLGMETMAW\QEIRHAI 3 LL\T 
DGK \ SH MGGSP KTAVDH I RE I LN 1 N0>'PN DY LD 3 Y Al GVGKbDV 
DWRF.I.N ELGSK KDG ERHAFI LQDTKAl.HQVFEHML.DVS KLTDTI 
CGVGNMSANASDQERTPWHVTIKPKSOET\C\RGALISDOWVLT 
AAHCFRDGNDHSLWRVNVGDPKS0WGKEFL1EKAVISPGFDVFA 
KKK-OG 3 L\EF Y GD\ D3 ALL \ K LAQKVKM\ S7HCQGPSCLP\ CTM 
\EA>fLGFLRETFKGSTCR\D}IENEL/VlVNKOSV\PAHF\VAL\N 
GSKLFHLTLRMGVEWTSCCRGIjSPKKKTM\FFNLT\DVRB\VVT 
D\OFL\ CS\GPOEDESP\CK«£\SGGA\VFLtKRKKLSAt;GVWC 
SWGL\YNP\O^SA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 
0*£ PWLRQHPGGMS * I FLPLLAMGKLSPFACPAR I CRPLKFLPS 
EKATLRTL 


17? 


1013 


PL1SVPQSLISLPQPLLCFPGGQEPSAPSPC1YSFLWACSF7MG 
KLPPSI PPSSPLACVLKNLKPLQLTPDLKPKCL3 FFCNTAWPQY 
KLDNDS X * PENG TFE FS 1 LO V LDNS CH KMGKWS £ V PD VQ A F F \ S 
HWSLPSLCSQC/GLIPNLSSF5PFCSFG/PPPQVPSP/TESFFS 
MDSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPFHTWS 
SI-OFHSVTSPPPPAQQFTLKKVAGAKGI VKV£APFSLSQ3R*RL 
GSFSSN1 K3QPSSWL3 WQQP 


668 


1684 


CLPGPRWGEGWRAGHTlVGC3FFKTAi:SHFKGGMYLCVCMCTC 
LSVCVCV0VGSW1CV/CVSNCACVSLCTC\1CRCISMYTREHAC 
ACTRV * V YMCKS / VCTCVSTCI DVRVCAHVCVYMCLCLG YA* AC 
TCV-MCVGMHEKVa-lC/VCACSCVLL/CRGH3 CM/MCMSAYI CI 
/ CV Y VC V LCWJACMRMSTC WO LVYG * ACTCVWMHM/ CSCTCR/C 
VHVCCMS MHACECLCVYLH 1 CGCAGTRRWV7AGS ARGSRSCSRLP 
CWAPGPG LSLPG PS CPSVEC>G LGGGPGOLGGRSGEARLGEHRGW 
GSPAAVCSRNCTVSPRRGADCFcAPDVPKOPPGWGRASFEERGC 
GGRGKVCAPPLNGPQCCCFS3 KPELKAKKXK 


5866


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKKNKGKERRDLDDL 
KKEVAMTEHKMSVEEVCRKYNTDCVOGLTHSKAQE1LARDGPNA 
LTP P PTT PEWVX FCROLFGGFS I LLW1GA I LCFLAYG I QAGTED 
DPSGDNLYLGI VLAAWI ITGCFS YYQEAXSSK1 MESFKNMVPQ 
QALV I REGEKMOVNA EE WVGDLV EI KGGDRV P ADLRI I S AHGC 
KVDNSSLTGESE PQTR S PDCTHE\NPLKTRKI TFFSNNFVEGTA 
RG WVATGDRTVMGR I ATLASGL.E VGKTP 1 AI EI EHFI QLI TGV 
AVFLC-VS FFI LSLILGYTWLEAVI FLIGI I VANVPEGLLATVTV 
CLTLTAKKHAKKNCLVKNLEAVETLGsrST 1 CSDKTGTLTQNRM 
TVAHHWFDNQI HEADTTEDOSGTS FDKSSHTtVVALF*H/LLGFC 
NK PVFKGGQDN1 PVLKRDVAGDASESALLKC1 ELSSGSVKLMRB 
RNKKVAE 1 P FNSTNK YQLS 1 HETEDPNDNRYLLVKKGAPER 1 LD 
RCSTILLQGKEQPLDEEMKEAFQNAYLELGGLGERVLGFCHYYL 
PEFQFPKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAAVPDAVG 
KCRSAGI KVIMVTGDHPITAKAIAKGVG1 1 FEGMETVEDI AARL 








lUPVSOVNPRDAKACVIHGTDLKDFTSEQIDEILQ^HTEIVFAR 
TS PQQKL I 1 VEGCQRQGAI VA VTGDG VNDSPAL. K KAD I GVAMG I 
UG SDV S KOAADM I LLDDN FAS I VTGVEEGRLI FEN LK KS I AYTL 
7SNIPE3TPFLLFIMANI PLPLGTITI LCI DLG7DKVPAI SLAY 
EAAESD1MKRQPRNFRTDKLVNEHLISMAYGQ1GM 7 QALGGFFS 
YFVILAZNGFLPGNLVGIRLNWDDR'I'VNDLEDSYGGOWTYEORK 
WEFTCHTAFFVSI VWOWADLI ICKTRRNSVFQOGMKNKILIF 
GLF3ETALAAFLSYCPGMDVALRMYPLKPSWWFCAFPYSFLIFV 
VDE1 RKL I LRRN PGGWVEKBT Y Y 


5867 


3 


1485 


LPGRRAKGGRGLGWPPAOALDGSKMGKAKVPASKRAPSSPVAKP 
GPV.KTLTRKKNKKKKRFWKSKAREVSKKPASGFGAV\T?P?KAPE 
DFS0NWKALQEWLLKQKSQAPEKPLVISQMGSKKKPKIIO0MKK 
ETS PQVKGEEMPAG KDQEASRGS VPSGS KMDR RAPV PRTKASGT 
EHNKKGTKERTNGD1VPERGDI EKKJKRKAK\GQPQPHPPR/IDI 
VSFDDVDPADIEAAIGPEAAKIARKOLGQSEGSVSLSLVKEQAFG 
Gl.TRALA LDCEMVG VGPXGEESMAAR VS 1 VNOYG K CV'YPK YVKP 
TFPVTDYRTAVSGIRPENLkOGEELEWOKEVAEKl KGRILVGH 
ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRFSLRLLSEK 
:i.CLOV0OAE11CSTODA0AAMRLYVMVKKEWESK2VRDRRPLLTA 
FDHCSDDA*QSCPAAAAAPLOROCDOSCGOITSPOSGNSGETFS 
FSWQRGVAWCY 


5868


2122 


833


Ltagashtodasqstsakypaaaonl/cvtnamredladi WYIR 

AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRAPSEPEDPV 
TFRSAFTERDAGSGLVTRLRERPALLVSSTSWTFDECFSILLAA 
I ESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIK0KHFOH 
1 OVCTPtfLEAEDY PLLLGSADLGVCLHTSSSGLDLPHKWDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFSDSEELAA01CMLFSNFP 
^FAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5869 


2122 


833 


LTAGASHTQDASQSTSAK YPAAAQNL/ CVTNAMR EDI iADI WY IR 
AVTVYDKPASFFKETPLDLQHRLFKKLGSMHSPFRARSEPEDPV 
TEKSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA 
LESRV*T\MTLDGHNLPSLVCVTTCKGPLREYY5RI.1H0KHF0H 
10VCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKWDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAOI^QMLFSNFP 
FPAGKLNOFRKNLRESQOLRWDESVfVQTVLPLVMDT 


5870 


2122 


833 


LTAG AS HTODASQSTSAKYPAAAQNL/ CVTNAMR EDI..ADIWY I R 
AVTVYDKPASFFKETPLDLOHRLFMKLGSMHSPFRARSEPEDPV 
7 ERSAFTERDAGSGLVTRLRERPALLVSSTSKTEDEDFSI LLAA 
:.ESRV^T\MTLDGHNLPSLVcVITGKGPLREYYSRLIiiQKHFOH 
•OVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLFMKWDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEEIAAOLQMLFSNFP 
rPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5871 


3 


3465 


FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSWKVLERRARTKJRS 
VLKLL* LSLRRL*LEPT1*NGLLT*CSRLSVFRFLKV\GSVYEP 
i.KSINLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGLFPTKT 
CGG DO KAK I QDSL YCAAGAWALALAYRR I DDDKGRTI iELEHSA I 

K OlRG I lycymroadkvwfkqdprpttclhsvfnvhtgdells 

Y EE YGHLQ I NAVSLY LLYLVEMI SSGLO 1 1 YNTDE VS FIQNLVF 

cv\ervyrvp\dfg\vwgkregkyy*/sgstelhsssvglgkro 

1 * KOFNGFNLFGNQGCSWSVI FVDI jpAHjfRNRQTLCS LLPRESR 
SHNTDAALLPCISYPAFALDDEVLFSQTLDKWRKLKGKYGFKR 
FLRDGYRTSLEDFNRCYYKPAEIKLFDGIECEFPIFFLYMMIDG 
VFRGNPKQVQEYQDLLTPVLHHTTEGYPVVPKYYYVPADFVEYE 
KKN PGSQKR FPSNCGRDGKLFLWGOALY 1 1 AKLLADE L I S PKDI 
DPVQR Y VPLKDQRNVSMRFSNQG PLENDLWHVAL 1 A ESQRXQV 
FL.MTYG 1 0T0TPC0VEP IQI WpQOELVXAYLQLG J NE KLGLSGR 
FCRPIGCLGTSKIYRILGKTWCYPIIFDLSDFYWS0WFLL1D 
D: KKALOFIXQYWKMHGRPLFLVLIREDNIRGSRFNP ILDMIiAA 
LKKGII GG VKVHVDRLOTLISGAWEOLDFLR I SDTEELPEFKS 
F LELEPPKHS KvyRQS STPSAPELGOQPDVN I SEWKD KPTHEIL 
C-KLNDCSCLASQAILLG JLLKREGPNF1 TKEGTVSDH1 ERVYRR 








P.GSOKLWSWRKAASLLSKWDSLAPSITIA'LVOGKQVTLGAFG" 
X. EE EVI SNPLS ;'RVI QN I I Y VKCKTHCF K EAVI QQE LV I HI GWI 
JSNNPELFSGTLKIRJGWI 1HAME VEI.QJ RGGDKPALDLYQLSP 
SEVKQLLLDILQPCQNGRO^LNRROIDGSLNRTPTGFYDRVWQI 
LERTPNGII VAGKHLPQOPTlSDMTMYEMNFSLLVEDTIjGMIDQ 
F0YRQ1WELLMWS1VLERNPELEFODKVDLDRLVKBAFNEFQ 
KDQ S R LKE I E KCDDKTS F YNTP PLG KRG TCS Y LTKAVMNLLLEG 
EVKPNNDDPCLIS 


5872 


68 


661


VUGYMYRFV1KINSCYSEKTS3CHHRCCPELPAT0PWPTPTVFF 
I^AIDSESLGCIXSFKLFADXV/PKRWKKNFVLLNTGEXVLGDK 
GPCFYRlIPG\LCOGGDfTH}WGTC.-GKSLYSKEFDDENFI/bKK 
1 APCVLSTANAG PTTNGSQFF 1 CTAKTEDG ♦ QHWFGKVKDGM5 
1 VEALERSGSRNGKTSKKI TAANCGOb 


5873 


2240 


506 


RRFPEGGSGGGRRTRARMPl^PWSliALPijLljSWVAGGFGNAASAR 
HHGIiLASARQPGVCHYGTKLACCYGWRRNSKGVCEATCEPGCKF 
Gfc'CVGPNKCRCFPGyTGKTCSCDVK'ECGMKPRPCOHRCVNTHGS 
Y KCFCLSGHM,MPDATCVNSRTCAM I NCCYSCEDTEEG PQCLCP 
ESGLRLAPNGRDCLDIDECASGKV2 CPYNRRCVNTFGS YYCKCH 
3GFEL0YISGnyDCin?NECTMDSHTCSHHANCFNTQGSFXCKC 
KCGYKGNGLRCSAI PENSVKEVLRAPGT3 KDRJ XXLLAHXNSMK 
KXA K 3 KNVTP E PTRTPTP K VN 1>QP KN Y EEI VSRGGN SHGG\ K KG 
NEEKMKEGLEDEKREEKAbKD*HRRERPFRG\DVFPPKVNEAGE 
FGL2 L\V0RKALTSKLEHKADLN3 SVDCSFNHG\ICDW\KQDR\ 
E DC F D W \ N PAD K \ DN A 1 \G F Y \MA V PG I . WOGK K \ XD I GR LKLLL 
PDLOPOSNFCLLFDYRI^GDXVGKLRVFVKNSNNALAWEKTTSE 
DEKKKTGXIQLYOGTDATKS I IFEAERGKGKTGEIAVDGVLbVS 
GLCPDSL1>SVDD 


5874 


2 


3387 


ACPR I ARR RR K VRS LRR RRG W LRAR WSKGONNMAARR I TOET FD 
AVL0EKAKRYHNDASGEAV5ETLOFKA0DLLRAVPRSRAEMYDD 
VHSDGRYSLSGSVAHSRDAGRESLR.SDVFSGPSFRSSNPSISDD 
SYrRKECGRDLEKSHSNSRDOVIGHRKl^HERSQDWKFALRGSW 
FQDFGHP VSQEE S WSQEYS FG PS AV LGDFGSSRL I EKECLEKE \ 
S^DYDVDHSG\EA\DSVLRGS\SOVOA\RGRALNIVDQEGSLLG 
. KGETOGLLTAKGGVGKLVTLRNVS1 KKI PTVNR1TPKTQGTNQI 
OW-rrPSPDVTLGTNPGTED j QFP1CK1 PLGLPLKNLRLPRRKMS 
FD13 DK SDVFSR FGIEI 1 K W AG FHT 1 KDD 1 KFSQLFQTL.F E1>ET 
ETCAKMLA£FKCSbKPEHRl;FCFFTlKFbKHSAI>XTPRVDNEFL 
NK1U.DKGAVKTKNCFFEII KPFDKY J MRL.QDRLLKSVTPLLMAC 
KAYEbSYXMKTLSNPLDLALALE'nKSLCRKSLALLGOTFSLAS 
S?R0EK3L*AVGl^DIAPSPAAFPNFEDSTLFGREYIDHLKAWl3 
VSSGCPLOVKKAEPEPMREEEXM1FPTXPEI0AKAPSSLSDAVP 
ORADHR WGTl EQLVKR VI EGSLS PKERTLLKEDPAYVJFUSDEN 
SLLYKYYKbKI^EMO^SENLRGADOKPTSAlX^AVRAr'lLYSRAV 
RNLKKKLLP\ WORRGLLRAOG \ LRG\ WKAR RA\TTGTQTLLFLR 
APGLKHHGROAPGLS\QAKPSbPDRND\AAKD\CPLDPV\GPSP 
QDPSLEASGPSPKPAGVD3SEAPQTSSPCPSADIDMKDNGRTAE 
KLAR FVAQ VG \ PE 1 EQP\S1 \ENSTDNPDLWFL\HDONSS\AFK 
FY\HKKVFELCPSICFTSSPHNL\KTGGGDTT\GSOESPVDLME 
GEAEFEDEPPPREAELESPEVMPEEEDEDDEDGGEEAPA\PGRG 
GPSLEGSTPADGLPGEA\AEDDL/AbGAPALFTGLLOVTCFPFG 
RGFSSKSLKVGM 3 PAP XR VCL I QE PKVHEP VR I AYDR PRG3PKS 
X KXX PKDI>DFAQOKL\ TDK\ NI>GF0\ MLOKMGWKEGHGLGSLGK 
G2R\SRSACTQQAAWGGSGWGLSPSTCSLPLGSFTAKMAYSWQI* 
1FV? 


5875


296 


1846 


LAALGGLPLWRLSRRGFREYLLGLSAPSALGGAMRSVSYVQRVA 
LE FSGS LFPHA I CLGDVDNDTLNEL WGDTSGKVSVY KNDDSR P 
WLTCSCQGMIiTCVGVGDVCKKGKNLLVAYSAEGWFHLFDLTPAK 
VLDASGHKETLI GEEQRPVF KQH 1 P ANTKV'MbI SDIDGDGCREI* 
WGYTDRVVRAFRWEELGEGPEHLTGQLVSLKKWMLEGQVjDSLS 
VTLGPLGLPELMVSQPGCAY A3 LLCTWKKDTGS PPASEGPTDGS 
/SGDPS CPRRGA^PDl WPYPQOECLHSPNWQHQT\SHGTESSGS 








GLFAI>CTLDGTLKLMEEMEEADKLLWSVQVDHQLFAX»EKL>DVTG 
HGHEEWRCAWDGCTY i 1 OHNRTWRF0VDENIRAFCAGLYACK 
EGRNSPCbYYVTFNQKIYVYWEVQLERMESTNLVKLLETKPVST , 
TACCRSWAWILTTSL*LVPCFTKRSTI0TSHHSVXP0ASR3 PPS 1 
*?TCLIAGEGFF*TPTLPPKCVFGSHCAAAGSITXQ 1 


5876


1122 


224 


HLPLGVPS KVAG AAANE P QEERJETQVAAWLXXI FGDH P 1 ?QY EY | 
KPRTTE1 L/HHLS ERNR VRDRDVYLVI EDLXQXASEYESEAK Y 
DLLMESVNFSPANLSSTG5RYLNALVDSAVALETKDTSLASF1P 
AVWDLTSDLFRTKSKSEEIKIELEKLEKNLTATLVLEKCLOEDV 
XK*E1,HLSTER\AKVDNRRQNM\DFLXAXSEEFRFGJQAAGEQL 
SARG0\ DAFSVP 1 QSLVALIRENWPRLXQQTI PLK\ KXliESYLD 

lmpVkpshcsk^rieeakVrelaXsieaeltrrvsNmmel 


5877 


2030 


1907 


GTLGKMAASSSGEKEKERLGGGLGVAGGNSTRERLLSALEDLiEV 
LSRELI EXLAI SRNCKLLOAGEENOVLELLIHRDGEKOELMKLA 
LNOGKI HHENOVLEKEVE KRDSDI OOLQK0LKEAEOI LATAVnf 0 
AKE KL K S 3 E KAR K GA 1 S S E E 1 1 K Y AH R 1 S ASNA VCAP LTW V PGD 
PR R P Y ? TDLEMR S GL LGOMNN PSTNGVNGHLPGDALA / RR X I AR 
CPCSTVS/NGSOMTCR*INIILILQKSVCEL 


5878


950 


211? 


GLWKCM0LOGPKTHRVQP* PTPRQQGPQ\VPVAVIAGNRPNY1jY 
RMLRSL»LSAOGVSPOMlTVFIDGYYEEPMDWALFGLRGlC?HTF 
IS 1 KNA^VSgHYKASLTATFPTLFPEAKFAWLEEDLDlAVDFFS 
FLS0S I HLI.EF.DDSLYC1 SAWNUQGYEHTAEDPALLYRVETMPG 
LCWV1.RRSLYKEKLEPKW PTPEKLWDWDMWMRMPEQRRGRECI 1 j 
pnv c;o q yhpc: i vm .mmnc; y thray FKKHKFNTVPGVCLRN VDSL 

xkeayevevhrllseaevldhsknpcedsflpdteghtyvafir 
mekdddfttwt03uakclh 3 wdldvrgnhrglvtrlfrkknhflw 
gvpaspysvxkppsvtpiflepppxeegapgapeqt 


5879 


3 


981 


rlteaaaagsgsraagwagspptllplsptsprcaatmassded I 

GTNGGASEAGEDREAPGKRRRtGFLATAVfLTFYDIAMTAGWLVb 
A3 AWVRFYME KGTHRGLY KS I OKTLKFFQnTFALLE I VHCL1 G I V 
PTSV1 VTGVOVSSR I FMVWMTHSI XPIQNEESVVLFLVAWTVT 
E1TRYSFYTFSLLDHLFYF1 XWARYNFFI 1 LYPVGVAGELLT I Y 
AALPHVKKTGMFS IRLPNKYNVSFDYYYFLL1TMASYI PLFPOL 
YFHWLRQRRKVLHG\G*L*XRMIK*SLOTRCFF0NNODYIjSPSF 
NNXNXQliCEISVU VWFLK1 


5880 


1138 


1324 


SLWCLVAGGLGLG PSSCN PLQRAG I LA^PREARGTFS ALTACS A 
S VTS KGKSSSGMW PSAAS DR DS PVPLR PPGPVQLPSGTGW \TbSD 
♦XXKRGRCSS/WLSQPCHEREXEVVXLRRSMAEGERARAASDVI. 
CRSLANETHOLRR TLTATAKMCOHIAKCLDERQHAORNVG ERS P 
DOSEHTDGHTS VQS V I EKLOE ENRLLK0KVTHVEDLNAKW0R Y*N 
ASRDEYVRGLHAQLRGLOIPHEPELMRKEISRLNRQLEEKINDC 
A2VKOElJVA5RTA^DAALERV0^E00IIAYKDDFMSERADRER 
AOSR10ELEEKVASLLHQVSWRQDSREPDAGRIHAGSKTAKYLA 
ADALELWVPGGKRPGTGS0CPEPPAEGGHPGAA0RG0GDL0CPH 
CL0CFSDEOGEELLRHVAECCQ 


5881 


26 


441 


GGI^PSPTEAPRAOHLTMDCTW.RILFLVAAATGTHAQVOLLQSG 
SE VXK PGAS VMVS CYVSG YTLTKLSMHWVRQAPGKGLE* MGPFD 
LQDVETIYP0KFOGRVSMTEETSTETTQ/AYIjEI>SSLRSEDTAV 
HHCATDTV 


5882 


2407


2216 


sgcsn£klyshsleynpekisvqsavapaqlalnsdgdu*lhsge 
rtrrd * qlp sagg pglqep u0u5elditsdefilde vdg\vdlr 
hyskqvelelqql eoxs 1 rdy i qeseni aslhnq1tacdavler 
meqmlgafqsdlss 1 sseirtlqeosgamnirlrnrqavrgklg 
elvdglwpsalvtaileapvteprfleolqeldakaaavreqe 
argtaacapv^gvldrlrvkavtkirefilokiysfrxpmtnyq 
i potallk yrffy0fllgneratake3 rdeyvetlsxi ylsy yr 
syu5rlmkv0yeevaekpdlmgvecrtakkgffsxpslrsrntlf 
tlgtrgsv1spteleapilvphta0rge0rypfealfrs0hyal 
ldns cr e y i.fi cef fwsgpaahdlfhavmgrtlsmtlkjilds y 

LAIXTYmiAVFIXlHIVLRFRNIAAXRDVPALDRYWEOVLALLW 








PRFELI^EKNVOSVRSTDPQRLGGLDTRPIJVITRRVAEFSSALV 
S I NOT 1 y N ERTMO^LGQLQVE VEK FVLH VAAE FS S S KEQLV FLI 
NNYDMMLGVLMX E * ERAADDS KEVES FQQLLNARTCEF1 EELLS 
PPFGGLVAFVKEAEALIERGQAEKLRGEEARVTOLI.RGFGSSWK 
SSVES LSODVMR S FTN FRNGTS 1 1 0G ALTQLI 0 \ :• I R FHR V\ L 
S0PQLRALPARAELIN1HHLMVELKKHKPNF 


5883 


2 


1374 


E FPG R R FRA E AG AG AG AG AAG W£ C PG PG PT VT71 G S Y EAS EG 
CERKKGORWGS1.ERRGM0AMEGEVLLPALYEEEEEEKEEEEEVE 
EEEEOVOKGGSVGSLSVKKHRGLSLTETELEELRACVLQLVAElr 
EETRELAGQHEDDS LELOGLLEDERLASACXJAEVFTK QI QQLQG 
ELRSLP.EEISLLEHEXESELKEIEOELHLAOAEJOSLROAAEDS 
ATEHESD I ASLQEDLCRMQNEbEDMERIRGDYEME I ASLRAEME 
MKSSEPSGSLGLSDYSGLQEELQFXRERYHFLNEEYRALQESNS 
SLTGOLADLE S ERTOR-^TERWLOSOTLSMTSAESOTS SMDFLE P 
DPEMQLLRQQLRDAEEOMHGMKNKCOELCCELEELOKHRQVSEE 
EORRLOREl.KCAONEVLRFQTSHSNSPSHPLPPIPPSSPCLL^A 
LW1SALLWCWWAETSS 


5884 


4261 


252^' 


GVbARAS/vRLRVPLTGVRACAEPEVGAE PAKVAGA^.E PDEDGGR 
S RLRDCGD YTPS L'R W5P KGAMLWFQGA1 PAA I ATA K R SGAVF VV 
FVAGDDEQ STQMAAS WE DDKVTEAS SN S FVA I K I DT K S EACLQF 
SQI YPWCVPSSFFlGDSGlPLEVI AGSVSADEbVTR I HKVRQM 
HLLXSETSVANGSOSESSVSTPSASFFPNNTCENSO-^RNAELCE 
1PSTSDTKSDTA'JX ; GESAGHATSSQEFSGCSDQRPAEDLMIRVE 
RL.TKKI .EERREEKKKEEEQRE1 KKEI ERRKTGKEMLDYKRKQEE 
BLTXRMLE ERN3 EKAEDRAARER I KQQ1 ALDRAERAARFAKTXE 
EVEAAKAAALLAKQAEMEVKRES YARERSTVAR J QFRLPDGSSF 
TNQFPSDAPLEEAROFAAQTVGNTYGNFS LATMFPR h fcFTK ED Y 
KKKbLDLELAPS AS WLL»? /AI>F INF * AGRPTAS 1 VKS SSGD3 W 
TLLGTVLY PFLA J WRLISNFLFSNPPPTOTSVRVTS^ EPPMPAS 
SSKSEKREPVRKRVLEKRGDDFKKEGKIYRLRTQDDGEDENNTW 
NGWSTQOK 


5885 


900 


467 


AAGGGRRSRLSRSWPTGPSK£PSGVRCCG\RR\AWEEKDEFLDV 
I YWFRQ3 LAW jLGVI WGVLPLRGFLG I AGFCLINAGVbYliYFSN 
YL01DEEEYGGTWEt.TKEGFWT£FA/IVHGHbDHLLHCHPL*LM 
VYS SOVLP 3 QS KG P£ 


5886 


8fe 

1 


i3U 


PFRGRALTLKKO PRPGVAP PSLGTCH KSDPGR PAAOSC-P P S PGS 
GTFGbLSFRMVRTKTWTLKKHFVGYPTNSDFELKTBELPPLXNG 
EVbLEALFLTVDPY^VAAKRLKEGDTMMGQQVAKVVI.SKNVAL 
PKGTI VLAS PG WTTHSI SDGKDLEKLLTEWPDTI PUiLALGTVG 
MPGLTAY FGLLE I CGVXGGETVMVNAAAGAVGSWGQ I AKLKGC 
KWGAVGSDEKX'AYLOKbGFDWFNYKTVESLEETLKKASPDGY 
DCYPDNVGGEFSNTV3GQMKKFGR1AICGAISTYNRTGPLPPGP 
PPElGIYOELRMEAFWYRWOGDARQKAliKDLbKWVLEbPYFVI 
D* LOANTLVYKSMKS AKPSbEYI SEKbVSG\K 1QY KEYII EGFE 
NMPAAFMGMLKGDNLGKTIVKA 


5887


1537


104 


APGCRGtRATRCPCRGPRWDSLGDEAARSPAAPGGAPGLLGLRE 
RPDRCKPGGDDRGPOLHRGSPG/SPSELSRRPGPPGLPGIiQGPP 
PAPG1,PQSRTL/PVLCVCDLSPAQCD1NCCCPFDCSS\T5FSVFS 
ACSVPWTGDSOFCSOKAVIYSLNFTANPPQRVFELVDOINPSI 
FCI HI TN\ * NLHY PbUlOKYL/NENNFDTLKKTSDG FTLNAES Y 
VSFTTKLDIPTAAKYEYGVPLQTSDSFLRFPSSLTSSLCTDNNP 
AAFLVN0AVKCTRKINLE0CEEIEALSMAFYSSPB3LRVPDSRK 
KVPITVQS I VIQSLNKTLTRREDTDVLOPTLVNAGHFSLCVNW 
LEVKYSLTYTDAGEVTKADLSFVLGTVSSVWPLQQKFEIHFLQ 
ENTOPVPLSGNPGYVVGLPLAAGFQPHKGSGIIOTTNRyGOLTI 
LHSTTEODCLAbEGVRTPVXFGYTMOSGCKLRLTGALPCOLVAQ 
KVKS LLK'GCG FPD YVAPFGNSQGP / ADMLDWVP IHF I TQS FNRK 
DSCOLPGALVIEVKVfTKYGSLLNPOAKlVNVTANLlSSSFPEAN 
SGNERTI hi S TA VTFVDVS APAEAG FRAPPA INARLP FN F FPPF 
V 










375 


2302 


IJ.CRTPC-V^ORADSEOPSKRPRCDDSPRTPSNTPSAEADWSPG 
LELHPDYKTWGFECVCSFLRBGGFEEPVLLKNIR^NEITGALLP 
CLDESR FENLGVSSLGERKKLLS Y IQRLVQI HVDTtfKV INDP1H 
CKIELHPLLVRI1DTPQFQRLRYIKQU3CGYYVFPGASKNRFEH 
S L«G V G YLAG C LVKALG E KQPELQ I S ERDVTjCVQ I AG LCHDLGHG 

pfshmfdgrfiplarpevkwtkeqgsvmmfehlinsngikpvme 
qygli peedicf3keqi vgplespvedslwpykgrfenksflye 
3 vsk k:^ng 1 dvdkwdy fardchhliglqnnfdy kr fi kfarvcev 
dn e lr i car d ke vgnl ydmfhtrns lhrrayqh kvgn 3 1 dtmit 
daflkaddy i ei tgaggkkyristaiddmeaytkltdni fleil 
ystdpklkdareilk01eyrnlfkyvgetqptgoik3kredyer 
lpkevasakpkvlldvklkabdfivdvinmdygmoekkpidhvs 
fycktapnrairitkn0vs0l»t.p\ekfaeo\llrvyckkvdrks 
lya\aroyfvow\cadr\nft\kpqdgrcy*pptp*kfokkgw\ 
npstfspki ptrlprrlpksrv\olfjodpm 


5889




731 


LPAACGRPVTAR PRCAPEGRSGRPRDL3PYPPQ VFPPR PDR VAI 
VTGGTDG3 GYSTAKHLARLGMHVI I AGNNDSKAKQW5K 1 KEET 
I^DKET* VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLLK 
FG 1 FI L\DLASMTS I RQFV0KFKMKKI PLHVL 1 NNAGVMMVPQR 
KTR DG FEEH FGLNY UGHFLLTNLLLDTLKESGS PGH S AR WT VS 
SATHYVAELIs'KDDLOSSACYSPHAAYAQSKLALVLFTYHLORLL 
APJiGSHVTANWUPGVVNTDLYKHVFWATRLAKKLbGWLl,FKTP 

Ur,vi><V» ] oil AMV 1 rtbtwVwKI Ij J WAIVE, J ivo L>i. V l J riV*^l-»V%A< 

LWSKSCEMTGVLDVTX 


5890


1271 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLOSGTEAACRS 
GRPDPRPASAJ^GGHAGERMSQRDTbVHLFAGGCGGTVGAlLTCP 
LEWKTRLOSSSVTLYISEVOUTTMAGASVNRWSPGPLHCLXV 
ILEXEGPRSLFRGLGFNLVGVAPSRAI YTAAYSNCKEKJjNDVFD 
PDSTQVHM1 SAAMAGFTAI TATNPIWLI KTRLQL.* / SOGTAGKR 
RMGAFECVR KVYQTDGLKG FYRGMSAS YAG3 SETV IKFV1 YESI 
KOKLLEYKTA£TMENDEESVKEASDFVGMMLAAATSK\LVATTI 

LVRQJ P\NTAlMMATYELWYIibNG 


5891 


1322


200 


FRR0WSAAGRAVPVAFCSR1SASSPRRPRGAVRL0SGTEAACRS 
GRPDI RPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAI 1/TCP 
LEVrvXTRLOSSSVTLYlSEVQLNTMAGASVNRWSPGPLHCLKV 
I LE KEG PRSLFRGLG PNI»VGVA PSRA I Y FAAYS NCKEK LNDVFD 
PDS10\^M1SAAMAGFTA1TATNPIWL3KTRLQL* / SOGTAGKR 
RMGAFECVR KVYQTDGLKG FYRGMSAS YAG I SET V I H FV1 YESI 
KOKXLE Y KTASTMENDEESVKEASDFVGMMLAAATSK\ LVATTI 
AY PHI W RTR LREEGTK YR S FFQTLSLLVOE EG Y G £ LY RGLTTH 
LVRQI P \NTA 1 MMATY ELWYLLNG 


5892


1764 


379 


WLRVCGR LS VNSA VSS RTGGWS AGLTCAMORLOWLGHLRGPA 
DSG m P0 AAPCLSGAPHASAADWWHGRRTAI CRAGRGGFKDT 
TPDELLS AVMTAVLKPVNLR PEQLGDI CVGNVLQPGAGA I MAR I 
AQFLSD I PErVPLSTVNROCSSGLOAVAS I AGG I RNGSY DI GMA 
CGVESMS LADRGNPGNI TSRLMEKEKARDCL1 PMG 3 TSENVAER 
FGISREKQDTFALASQOKAARAQSKGCFQAEIVPVTTTVHDDKG 
TKRSITVTODEGIRPSTTMEGLAKLKPAFKKJJGSTTAGNSSQVS 
DGAAA I LLAR RS KAEEI X5LP3 LG VLRSYAWGVPPD I MG 3 GPAY 
AI PVALQKAGLTVSDVDI FEINE\AFASQAAYCVEKLRLPP* EG 
*TPU^ASGP*GHPIXLHWGP^0VIXLA0*S*SARGKRAYRSGC 
PCAIGSWNGSPLPVFEYPWGT 


5893 


2 


1653 


ILSKRRCOKAKTKEI^AKKVAVlGAGVSGLISbKCCVDEGLEPT 
CFER TED 3 GGVWR FKE.WEDGRAS I YQSWTNTS KEMS CFSDFP 
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFOTTVLSVRKCP 
DFSSSGCWKWT0SNGKEQSAVFDAVMVCSGHH1 LPKI PLKSFP 
GMERFKG0YFHSRQYKHPDGFEGKR1LV3GMGNLGSDIAVELSK 
KAAOVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSKLRNVLPR 
TAVKWtf I EQOMKRWFNHENYGLEPQUKY IMKEPVLNDDV P SR1.L 
CGAI KVKSTVKELTETSAI FEDGTVEENIDVI I FATGYSFS FPF 








LEDS I.VKVENWVSLYKYIFPAHLDKSTLACIGL1QPLGS1FPT 
AELOARWVTRVFKGLCSLPSERTKMMDI I KUNEKR 1 DLFGESQS 
OTLCTNYVDYLDELALEIGAKPDFCSLLFKDPKLAVRLYFGPCN 
SY * VRLVGPGQWEGARNAIFT0K0RILKP1.KTRAJLKDSSNFSVS 
FLLK2 LGL1AVWAFF\CQLQWS 


5894


11 A 


1673 


RYS ?KKVLCN KESSLKLGMATALVSAflSI JiPLNLKKEGLRVVRE 
DH YS TWEOG F XLQGN5 KGLGQ. E P LCKQFR C » >R Y EETTG PREALS 
K LR SbCOOWLQPETHTKEH 1 LELLV1 jEQFL j 1 1 .PKELQARVQEH 
KPE£.^EDVWVLEDLQLDLGETGQQVDPD<;;PXK0K1LVEEMAPI» 
KGVOEQQVRHECEVTKPEKEKCEETRIENGKLIVVTDSCGRVES 
SGK1SEPMEAHNEGSNLERHCAXPKEKIEYKCGERE0RFI0HLD 
L3 EHASTHTGKKIXESDVCQSSSLTGHKKVLS* ERKVIQC\HGV 
LGKA FQRS SHL.VRHQKI HLGE K P Y OCNECG KV FSQNAGLLEHLR 
1 HTGEKPYLCI HCGKNFRR5SKLNRHQR3 HSQEEPCBCKEOGKT 
FSOALLLTHHQRIHSHSKSHQCNECGKAFSLTSDblRHHRIHTG 
EKPFXCN3 COK-^FRLNSHLAOHVRI HNEEK PYOCSECGEAFRQR 
SGLFOHORYHHKBKLA 


5895 




86 


KPSLLGAIPFYPPFSSPWPPPLYLFWNSKRKSRHFINQRGIHGE 
KRLFVSrx5\TGCbFVLAAAGRARGRAEVL2S?VGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAlCRYFFMiLSGWEODDLTNOWLEW 
EATE^OPTLSAALYYIi\WQGKKG\EDVLG5VRRTLTHlDHSLS 
R0\NCPFLAGETESLADIVLWGALYPLLODPAYLPEELSALHSW 
F0Tl,.STQ\EPCQR\AARRLVLKQ\OGVL l ALR\PYLOXQPOPSPA 
EGKGLS P I E PE EEELATLS EEE I AMAVTAW F KG LES LP PIiRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGKI 1 GCVLSADVFARYS 
RLR QWNTL Y LCGTDEYGTATETKAL\ EEG LT PQE 1 CDKYH IIHA 
DIY\RWFNISFDIFGRTTTP00\TKIT\ODl FQQLLKRGFVbQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDOCDKCGKLI 
NAVELKKPQCKVCRSCPWQSSOliLFLDLPK-.EKRL.EEWlXSRTL 
PC-SDWTPNAQF1TPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK \ VFYVWFDATI G Y LSI TAN YTDQWE U WW \ KN PEQVDLYO 
FM\AKDMVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 
K\FSKS RGVGVFRDM\AHDTGI PPDISRFYL\ LY I RPEGK\DSA 
F£ WTI)LLJ. KNNS\ ELLNNLGN F I NRA \GMF VS K FFGG\ Y VPEMV 
L»T PDD0R1»I*A\ >1VTLELQ11 Y13Q\ LLEKVR 1 R D AbR S I LT I S \ RH 
GNQ Y I \ QVNE P W\ KR I KGSEADRQRAGTVTG LA VN I AALLS VML 
Q P Y M P TVS AT I QAQLQLP P P ACS 1 LLTNFLC 7 L P AGHQ1 G TVS P 
LF0K1 ,ENDQ 1 ESLRQRFGGGOAKTS PKPAW ETVTTAKPOQI QA 
LMDE VTKQGNl VRELKAQ3CADKNE VAA2VA K LLDLKKQLAVAEG 
KPPEAPKGKKKK 


5896




" 8€ " 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCXPVLAAAGRARGPJU^VLISTVGPEDCVVPFIjT 
RPKVFVL01>DSGNYLFSTSAICRYFF\LLSGWEODDLTKCWLEW 
EATELQPT1^AALYYL\VVQGKXG\ EDVbGS VRRTbTHl DHSLS 
RONNCPFT^GETESLADIVLWGALYPLLODFAYLPEELSALHSW 
FQTLSTO\EPCQR\AARRLVLKQ\OGVLALF\PYL0K0PQPSPA 
BGKGLS P1EPEEEELATLSEEEI AMAVTAWE KGLES LPPLR PQQ 
NPVLP VAGERNVLI TSALPYVNNVPHLGNI 3 GCVLS ADVFARYS 
RLROWNTLYLCGTDEYGTATETKAL\EEGLT PQE I CDKYHI IHA 
DI Y\RW FNI SFDI FGRTTTPQQ\TK1T\QD1 FQQbbKRGFVLQD 
T VEQLR CEn CAR F \ LADR FVEG VC P FCG Y EEAK G Uyt~L>lvCGKIiJ 

navelkkpqckvcrscpwqssqhlfldlpktekrl5ewlgrtl 
pgsdwtpnaqfitpffgfrewpskprwo*tri:lk\wgnpgtp*e 
gfedk \vfyvwfdatigyls i tawytdqwer ww\ knpeqvdlyq 

FM\AKDN\^PFHSLVFPSSALGAEDNYTL\VSHLIATEYIjNYEDG 
K\ FSKS RGVGVFRDM\ AHDTG 1 PPD 1 SRFYlALY 1RPEGK\DSA 
FSWTDLLLKIWS\EI^NNLGNFINRA\GMFVSKFFGG\YVPEMV 
LTPDDQRLLA\HVTLELQKYHQ\ LLEKVR I RDALR S I LTI S \ RH 
GNOY I \ QVNEP W\ KR I KGS EADRORAGTVTGLAVN 1 AALLS Vlib 
QPYMPTVSATIQAQLQLPPPACS3LLTNFLCTLPAGH0IGTVSP 
LFOKLENDQI ES LRQRFGGGQAKTS PKPA WET VTTAKPQQ I QA 








LMDEVTKOGNIVRELKAQKADKNEVAAEVAKLLDIjKKOLAVAEG 
KPPEAPKGXKKK 


5897 


2967 


86 


HPS LLC A I PFYPPPSSPWPPPLYLFWNSHRXSRWFINQRGlhGE 
^LFVSDGVPGCLPVXAAAGRARGRAEVLl STVGPEDCWPFLT 
RPKVPVLOLDSGNYbFSTSAJCRYFFNLLSGWEODDLTNOWLEW 
EATEbCPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHlD>lSLS 
ROXNCFFLAGETESliADIVLWGALYPLLODPAYLPEELSALKSW 
FQTUSTQ\EPCCR\AARRLVLKO\0SVIALR\PYLOKQP0PSPA 
EGXGLf PIEPEEEELATLSEEEIAMAVTAWEXGLESLPPLRPQO 
NPVLPVAGE3WVL1 TS ALPY VNNV PHLGN 1 1 GCVLSADVFARYS 
RLROWNTLYLCGTDEYCTATETKALVEEGLTPOEI CDKY:-!I II1A 
DI Y \RV3FN1 SFD1FGRTTTPQQ\TK1T\0BI FOOLLXRGFVliQD 
TVEOLRCEKCARF\LADRFVEGVCPFCGYEEARGDOCDKCGKLI 
NAVSLKKPOCKVCRSCPWOSSOHLFLDLPKIjEKRLEEWIjGRTL 

pgs2wtpnaqfitpffgfrewpskprwq*trdlk\wgnpgtp»e 
gffok\vfyvwfdatigylsitanytdovjerww\knpeovdlyq 
fm\akdnvpfhslvfpssalgaednytl\vshliateylnyedg 
x\fsksrgvgvfrdm\ahdtg1ppd3srfyl\ly1rpegk\dsa 
fswtdlli,xkns\ellnnlgnfinra\gmfv£kffgg\yvpemv 
ltpddcr1»la\hvtlelqhyhq\l.lekvr i rdajlrs ilt 1 s\rh 

GNQYI \QVTJEPW\KRI KGSEADR0RAGTVTGLAVN3AALLSVML 
OPYHPTVSATIOAQLO^PPPACSILLTNFLCTLPAGHQIGTVSP 
LFOKLEND0TESLRQRFGGGOAKTSPKPAWETVTTAKPOO10A 
l.MDl=:VTKQGNTVRRLKAQKAnKNEVAAF.VAKT.I,rJLKKQLAVAEG 
KPPEAPKGKKKK 


5898


2967 


86 


hps:»l,gaipfypppsspwppplylf'a'N£hrksrkfjno;rgihge 
mr lrvsdgvpgclpvlaaagrargraevll stvgpedcwpflt 
rpkvpvloldsgnylfstsalcryffvllsgweqddiitnowlew 

EATELOPTLSAALYYIi\VVOGKKG\EDVbGSVKRTLTHIDHSbS 
ROXNCPFl^GETESLADIVLWGALYPLI-ODPAYLPEELSALHSW 
FQTIiSTO\EPCQR\AARRLVLKQ\OGVLALR\PYLQKQP0PSPA 
SGKGl >SP I EPEEEELATLSEEEIAMA VTA WEKGLESLPPLRPOC 
NPVLPV/iGERNVLl TSALPYVNNVPHLGNl IGCVLSADVFARYS 
R LRQWNTLYLCGTDEYGTATETKAlA EEGLTPOE1 CDKYH 1 1HA 
D I Y \ R WFN I S FDI FGRTTTPQQ\TKI T\QD1 FQOLLKRGFVLOD 
TVEOLRCEHCARF\lJa)RFVEGVCPFa.-YEEAi?GDC)CDKCGKLI 
NAVEbKKPOCXVCRSCPVVOSSOHLFLDLPKLEKRLBEWLGRTL 
PGSDWTPNAQFITPFFGFREKPSKPRW0*TRDLK\KGIMPGTP*E 
GFEDK\VFYVKFDAT1GYLS1TANYTDCWERWW\KNPEQVDLYQ 
FMXAKDNVPFHSLVFPSSALGAEDNYTLNVSHLIATEYLNYEDG 
K\FSKSRGVGVFRDM\AHDTG1PPDISRFYL\LY3RPEGK\DSA 
FSWTDI^LKNNS\ELLNNLGNPINRA\GNF'VSKFFGG\YVPEMV 
bTPDfX?RLLA\HVTLELOH YHQ\LLEKVR 3 RPALRS 1 1/T I S\RH 
GNOY I \Q VNEPW\KJR I XGSEADRQRAGTVTGLAVNI AALLS VKX 
QP YM PTVS AT I QAQLQLPPPACS I LbTN FbCTLPAGHOl GT VSP 
LFQKLENDQ I ESLRQRFGGGQAKTS P KPAWETVTTAKPO01 OA 
l/MDEVTKOGN I VR ELXAQ KADKNE VAAE VA X 1»L»DL.K K Ql jA V AEG 
KPPEAPKGXKKK 


5899


326 


1078 


KCPKS KE PNGVRAPSLPSPU^AAMALSDVDVKKQIKKMMArl EQ 
EANE XAE E 1 DAKAEEEFNI EKGRLVOTOR LKi MEYYEKJCEKQI E 
O^KJ^ILTCSTMRifQAJU/KVLRARNDLI SDLLSEAXIiRTjSRIVEDP 

evyqg lld k l vlqg llrllep vm i vr cr p \ odlllveaavq kai 
peyhtisokhvev\01dkea*lavecswewjevtsgriorlkvsn 
tlesrldlsakokmpeirmalfgantwkff: 


5900 


64 


I4D9 


KAASRDSPCLEFCPLCGVSSHDLQHRMWYHRLSHIjHSRLODLLK 
GGV I Y P ALPQPNFKSLLPLAVHWHHTAS KS LTCAWQQHEDHFEL 
KYANTWR FDYVWLRDHCR S AS CYNS KTHORSliDTASVDliCl KP 

ktirldettlfftwpdghvtkydlnwlvknsyegokokvjopri 
lwnae i yoqaq vps vbcqs fletneglkkflqn fuuy g 1 afven 
vpptoehteklaerisliretiygrmwyftsdfsrgdtaytkla 

LDRHTDTT Y FQE PCG I QVFHCXjKHEGTGGRTLLVDG F Y AAEQVLi 








OKAPEEFEL1.SKSA1\KHEYIEDVGECHOPHDWDWAOS*ISTHG 
/yKJBLYblRYMNVDRAVlNTVPYD\AmRWYTAHKTLTIEl J PRPE 
WEFWVKLKPGRVI.FIDM.JRVLHGRECFTGYRQLCGCYLTRDDVL 
NTARLLGLQA 


5901 




212: 


VAIEQTSLKM,yOAVGGAFARPTGEYICNOCGAKYTSLDSFOTHL 
KTHLDTVL? KLTCPQCNKE FPNQESLLKHVTI HFMITSTY Y ICE 
SCDKOFTSVDDLOKHU.DMHTFVFFRCTLCOEVKDSKVSIOLHL 
\ AVKHSNEK K V Y H CTS CNKDFRN ETDLQ LWKHN'HLENOGXVHK 
CIFCGESFGTEVEI.QCHITTHSKKYNCKFCSKAFHAI ILLEKHL 
REKHCVFETKTFNCGTNGASEOVOKEEVEL0TLLTNS0SSHNSII 
CGSEEDVDTSEPKYGCDlCGAAYTy/ETLLONI-lOLRDHNIRPGES 
A I V KKKAEL 1 KGN Y KCNVCS RT FFSENGLREHMCTHLG P VKF:YM 
CP I CGERFPS LLTLTEH KVTHS KSLDTGNCR I CKMPLQSEEEFL 
EHCCMHPDLRNSLTGFRCWCMQTVTSTLELK1HGTFHMQKTGN 
GSAVOTTGRGQHVQKUYKCASCUKEF-RSKODLVKLDINGLPYGL 
CAGCVKLSKSAS FG 2 NV P PGTNR PGLGON ENLSA I EG KGKVGGL 
KTRCS*LATFKF* VLKVELPEPHPKPFHRGVSRPDSNS TQLKTP 
QV£ PMP RI S ? SQS DEKKTYOC I KCQWV FYN EWDI Q VH VANHK 1 D 
EGLNHECKLCEQTFDSPAKLQCHLIEHS FEGMGGTFKCPVCFTV 
FVQANKLO0H 1 FSW1GQEDK I Y D CTQC PQ K FF FQTE LQNHTMTQ 
HSS 


5902


712 


20V 


lknrrrsrpsirosigstsvsrwltslftyldhtadvq*v*ref 
j pi/xprq* ed * mfoswlhawgdtleeafeocamakfgymtdtgt 
vepujtvevetqgddlosllfhfldewlykfsadeffipVgwge 

EFS LSKK PQG TE V KA I TY S AMQVY NEEN PE V FVI I D I 


5903


2106 


73 i 


DTPGPSLPSTTAPFSLRSl*SFPSRP5YLLPGDPQPLQGRGbPTT 
PALFALSAVPGGAASPMPPSGljRbLPLLLPLLWLLVLTPGRPAA 
GLSTCKT3 DMELVKRKR I EA1 RGQI LSKLRLASPPSOGEVPPGP 
LPEAVLALYNSTR DR VAG ES AE P EPE PEADYY AKEVTRVLMVET 
HKEIYDKFKQSTHSIYMFFNTSELREAVPEPVLLSRAELRLIjRL 

klkveohvelyokysnnswrylsnrllapsdspewlsfdvtgw 
rowlsrggeiegfrlsahcscdsrdntlovdingfttgrxrgdl 
at i hgmnrpfi .llkatpi,2kaqhlqs \srhr0al\dtny\cfs f 
p:ggrnclrc/vhc^hlifrkdl\gw\kwi\he\pkgyhanfc\l 
gpcpyiwsldtqyskvlalyno\hkpg\asaap\ccvp0alep\ 
lpivyy\vgrk?kveqlsnmivrsckcs 


5904


3 


1126 


MKEEIENA1HTFKEEQRLI YEELIKEEKTTNNELSAISRKIDTW 
ALGNSETEKA FRA 3 SSKVPVDKVTPSTLPEEVLDFEKFLQQTGG 
ROGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTODEVQQ 
HEKWYQKFLALEERKKES I QI WKTKKQQKREEI F KJLK E KADNTP 
VLFHNKOEDNOKOK EE0RKX0KLAVEAWK0KS 1 EMSMKCASOIi 
KEEEEKEKKHOKERORQFKLKlLLESYrQOKKEQEEFLRLEKEI 
REKAEKAEKRKNAADE I S R FOERDLH KLELK 1 LDROAKEDEKSQ 
K QRRLAKLXE KVENNVS R D PSRLY/NTH QRLGRTNQKDRTNRLW 
AT5TYPT*GYSNLETRNTEK£Mk 


5905 


287 


291>* 


MAS FPPRVNEKEIX'RLRTIGELbAPAAPFDKKCGRENWTVAFAP 
DGS Y FAWSQG H RT VK LVP W S QCLQN F LLHGTKNVTNSS S LR1»?R 
0NSDGGQKNKPREH1IDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLLATGIiNSGRIKIWDVYTGiaLLNLrVDHTGWRDL 
TFAPDGSLI LVSAS RDKTLRVWDLRDDGN\MMKVLRGHONVJVY\ 
SCAFSPDSSKLCS VG AS KA WAA I L V * LR LCKHH SHTS ATM VLS 
WAERVASLATGLGATFT1G* SNLAFVLOGVLYVHRCWSMSTFCF 
SFFLFFFFKVI SPTVKYH * LLSKLI FQFYGIGSLTSETNLM* SI 
WLSNGFSVLFFGI LSDSRDI LRU* FNLKFVL1 FF * K* CI VS VQK 
KKKPKRIALL0EERLS*DKPPSSHL1*0TEVNIRZ1jFRAII>HS* 
bLIFRI*NC3*TYS*IIDPFYIQKTYDRG*FGKNKMVKF*FIEM 
* LYYFHKIAFSFCNW*KPCCLPKKFHLAVNlliFACSICFSS*A 
OVGDPSLL'TSDYLKGRCOWSNNLLTLRFLSVYFFKNLWSGXK 
R EGGL * YLTLF 3 S V Y FS * bVFG I NGFOY S FWKLHCLYFMFRLI 
FKLTFN RNI * NR I CMSALI NLKTDFNLTMTLS I FFKLLI I YNA * 
YNLN* I +QF* YKMCHFVL»CMSE*SYN J CLFIAGF\LWNMDXYTM 








: R K LEG HH K D VV ACD FS P DG AL LiATAS Y DT R V Y I WDPHNGD1 LM 
EFGHLFPPPTPI FAGGANDRWVRSVSFSHDGLHVASLADDKMVR 
FKR I DEDY PVQVAPLSNGLCCAFSTDGSV1AAGTHDGSVYFWAT 
PR Q VPSLQHbCRMS 3 RRVMFT0EV0ELP 3 PSXLLEFLS YR 1 


5906


146 


2035


RE^AGSGRMASGA\YNPYI El 1 EQPRQRGMRFRYXCEGRSAGSI 
PGEHSTDNNRTYPSI0IMWYYGKGXV\R3TLVTK\NDPYKPHPH 

dlvgkdcrd\gyyeaefgqe\hrp\lffon\lgircvkkkevke 

A\ UTR\3 KAGINPFDVP* KQLKD 1 EDCDLDWRLW FR VFLPDG 

hgnl\ttalp?v\vsspiydnrapntaelrvcrvnkncgsvrgg 
de i fllcdxvqxdd3 evrfvlndweakg3 fsqadvhrqva1vfk 
tfpycxaitepvtvknqlrrpsdoevsesmufrylpdekdtygn 
kakkokttlilfqklccdhvetgfrhvdcdglelltsgdpptlas 
qs ag i tvn fperpr pgllgs icegry fkke pnl fsi 1dawremp 
tgvssqaesyypspgp1ssglskhasmaplpssswssvahptpr 
sgktnplss fstrtlpsnsqg 3 pp fl.r i pvgndlnasnac i ynn 
addivgmeassmpsadlygisdpnm^sncsvnmmttssdsmget 
dnpr'-^lsmnlenpscnsvlidprdlrolhomssssmsagansntt 
vfvsosdafegsdfscadnsmjnesgpsnstnpnshvfvqdsoy 
sg :gsmqneqlsdsfpyeffqv 

tyllsswss • * nldtx 1 ksqvkv/rxghl<x3swpypcpaxc;i*gk 
katsxvpsaphfvhpndhanreaelkkkwveemrekoqaareqe 
rokrrtiesycodvlrroeefehkeevloelnmfpqlddeatrk 

AYYKE FRKWE Y S DV 3 LEVLDAR D PLG CR C FQME E A VLRAQGNK 

klvlvlnkidlvpkevvekwluylrnelptvafkastohovknl 
nrcsvpvd0asesllkskacfgaenlmrvlgnycrix3evrthir 

VGWGLPhJVGKSSIilNSLKRSRACSVGAVPGlTKFMOEVYLDKF 
IRI.LDAPGIVPGPNSEVGTILRJWHVOKLADPVTPVETILORC 
NLE E 1 S NYYGVS G FQTT EH FLTA V AH R LGKKXXGGLYSQEQAAK 
AVLADWVSCjK IbFY I PPPATH J LP t HLoAbJ VKbMi fcvrl/i c.*Jl 

ECANEDTMECLATGESDELLGDTDPLEMEIKLLHSPMTK1ADAI 
FJOKTTVYKIGDLTGYCTNPNRHQMGWAKRNVDHRPKSNSMVDVC 
SVtRRSVLOR 3 NETDPLQOG0ALASALKNKXXNOKRADK3 AS KL 
SDFMMSAIjDLSGNADDGVGD 


5907 


95 


1873 


5908 


247 


975 


HCC- 3 KKRGEGSGSPS PASGGFQLGC01 P2PSLPS EEETHPHTRA 
HTR TLRATLTRR PPR S HSTRLR FP MPLDGDGGlJiSWK/ PMRER* 
GWFK PAKAAGASLGVAATGKRGCRMSKRYL0KATKGKLL3 J IFI 
VTLV<GKWSSAN3mKA33HVKTGTCEWALHRCCNKNKIEERSQT 
VKCS CFPG0VAGTTRAAPSCVDAS3VE0XWWCHM0PCLEGEECK 
VLFr-RXGWSCSSGNKVKTTRVTH 


5909 


1 


5002 


PA 3 PGS7I 1 WA PG S H 5 AARADG « HG S LPS QSQA P GALCGARA PP 
S SN L RADR SMI CAQARAGKN LYHNR FIjGjLAAMAFP SRN S QS LRR 
CKE F I RYS YNPDQ FH NMDLRGG FK3DG VTI PRSTSDTDLVTS DSR 
STLMGRSS YYS 1GHSQDLVIHWDI XEEVDAGDW JGMYLI DEVLS 
ENFLDYKNRGVNGSHRGQ1IWKIDASSYFV5PETKICFKYYHGV 
SGALRATTPS V'lVKNSAAP I FKS I GADETVQGOGSRRL I SFS LS 
DFQAMGLKXGMFFNPDPYLKISIOPGKHSIFPALPHHGOERRSK 
I IGTvrrVWIWQAEOFSFVSLPTDVLElEVKDKFAKSRPI IKRFL 
GKLSM P VQRLLERHA I G DRWS Y TLGRR LPTDHVSGQLQ FRFE I 
TSS 3 HPDDEEI SLSTEPESAQI ODSPMNNLMESGSGEPRSEAPE 
SSZS WXPEQLGEGSVPDRPGNOS 1 ELSRPAEEAAVITEAGDQGM 
VSVG PEGAGELLAOVQKDI Q PAP S AEELAEQLDLGBEAS ALLLE 
DGE7J>ASTKEEPLEEEATTQSRAGREEBEKEQEEEGDVSTLEQG 
EGRLOLRASVXRKSRPCSLPVSELETVIASACGDPETPRTHY2R 
1 HTLliHSMPSAOGGSAAEEEDGAEEESTLKDSSEKIDGLS EVDTV 
AAD p S ALEEDR EEPEG ATPGTAH P G HSGGH FPS LANG AA QDGDT 
HPS7GSESDSS PRQGGDHSCEGCDASCCS PSCYSSSCYSTSCYS 
SSCySASCYS?SCYWGNRFASHTRFSSVDSAKlSESTVFSSQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDEIAAPSGHVER 
SPEGLESPVAGPSNRREGECP31>HNSQPVSQL?SLRPEHHHYPT 
I DE ?LPPN WE AR 1 DS1 JGRVP YVDKV^TTTWQR PT AAAT PDGKR 
RSGSI0QMEQLNRRYQNIQRT1ATEHSEEDSGSOSCE0APAGGG 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)






GGGGSDSEABSSQSSLDLRREGSLSPVNSQKlTLbLQSPAVKFl 
TN PE FFTVLHAN YS^.y RVFTS STCLKHM I LK\T4RDARNFER YQH 
NRDLVNFINMFADTRLELPRGWEJKTDOOGKSFFVDHNSRArTF 
IDPRlPbQmjRLPNKbTHRGHLQRLRSYSAGEASEVSRNRGASb 
LAR PG HS LVAAI R S jHQHES bPLAYNDK I V AFbRQ PN I FEMLQE 
ROPSLARNHTLREKIHYlRTHGNKGLEKIiSCDADLVlLLSLFEE 
E I MS YVP LOAAFH PG Y S FS P RCS PCS S PONS PGLQ RAS ARAPS P 
YRRDFEAKLRNFYRKI.EAKGFGCGPGKIKLI1RRDHLLEGTFN0 1 
VMAYSRKELORNKLYVTFVGSEGLDYSGPSREFFFLLSQELFNP 
YYGLFEYSANDTyr\?02SPMSAFVENHLEWFRFSGRILG\LALI 
H0YLLDAFFT\RPFyKALL\RLPC\D\LSDLEYLt>KEFHOSLOW i 
MKDNNI TDI LDLTFTVNEEVFGQVTERELKSGGANTQVTEKWKK ' 
EY3 ERMVXWRVERGWCOTEAbVRGFYEWDSRLVSVFDARELE 1 
bVIAGTAEIDLNDWRtWTEYRGGYHDGULVIRWFWAAVERFNNE | 
QRLRLLOFVTGTS S V PY EGFAAPPWEPMGLRR FLP 4 KKWGKI TS : 
LPPRG\HTCLOPuWDbPTVSPRTPMLYEK\LLTA\VEETSTFGT 


5910 


1526 


446 


VAEFAAMEPGRTQIKbDPRY77\DbbEVbKTNYGIPSACFSQP?T 
AAObbRALGPVEbAL TS 1 LTbbAbGS I AI FLEDAVYLY KNTbCP 
I KRRTLLWKSSAPTWS VLCCFGLKI PRSbVLVEMTI TSFYAVC 
FYbLMLVMVEGFGGKEAVbRTbRDTPMr4VHTGPCCCCCPCCPRL 
bbTRKKbQ\R* CWALSNTPS * R * R* PWWACFSSPTASMTQQTFL 
RGACbYGSTbSSA/CSTILAbWTbGIISRQARLHbGEONMGAKF 
AbFQVbLIbTALQPSI FSVLANGGQ1ACSPPYSSKTRSQVMNCH 
LL3 LETFLMTVI/TRMY YRRKIWKVGYETFSSPDbDLNLKALRWM 
AWTMKGCCTH 


5911 


109 


591


QbPlAPCIQGKGbEMRSPKPQSFlIRSSHSGAGbbVKNPSTPVF 
CGHRRGGAAFKYKPTPWGPEORPTGOKWMRGGVSLLSPRbECS 
GTISAHCNLRbPSSSKSPAPAS* bAGITGVCHHAObl FVFbVET 
GFHHVGQAGbELL/NVVIHbPRPPKVLGbQA 


5912 


924 


277 


M I bNXALMbGAbAbTT VMS PCGGEDIVADHVASYGVNLYQS YG P 
SGOYSHEFDGDEEFYVDbERKETVWQLPbFRRFRRFDPOFAbTN 
lAVbXHNbNIVIKKSNSTAATNEVPEVTVFSKSPVTLGQPNTbl 
CLVDNlFPPWNlTWbS^GKSVTEGVSETRPSSPKSDHFI.bQDO 
VTS PSFPFE * * Dlt* TAKVEOLGAWFEPbbXHWGAEI PTTL 


5913


46 


1396 


ObRWHGAEGAAGR0SEL£PVVSLVI)VbEEDEEL.ENEACAVbGGS 
DSEKCSYSOGSVTCROAbYACSTCTPEGEEPAGICIACSYECHOS 
HKLFELYTKRNFRCDCGNSKFKNLECKLLPDKAKVNSGNKYKDN 
FFGLYCI CKRPYPDPEDEI PDEMIOCWCBDWFHGRHLGAI PPE 
SGDFQEKVCQACMKRCS FLWAYAAOLAVTK I ST\GMMDNCGTLM 
E* /DDQEVJ KPENGEHODSrLKEDVPEQGKDDVREVKVEONSEP 
CAGSSSESDLOTVFKNESbNAESKSGCKbQEbKAKOblKKDTAT 
YWPbNWRSKLCTCQDCMKMYGDbDVLPLTDEYDTVLAYENKGKI 
A0ATDRSDPLMDTLSSMWRV0OVEL1C/GIQ* FED 


5914 


960 


124 


NbGGSEbPPEEALFIOVASMNQRRVDFYLASIEDMLVAl/GGRN 
ENGAbSS VETYSPKTDS WS YVAGLPRFTYGHAGTI YKDFVYI SG 
GHDYOIGPYRKNbLCYDKRTDWIEERRPMTTARGWHSMCSLGDS 
I YS IGGSDDI* I ESMERFDVLGVEAYSPQCNOWTRVAPbbHANSE 
SGVAWJEGRI Y I LGG YS WENTAFSKTVQV YDREADKWSRGVDLP 

IWiWAiinLf AMr *>1A»V*C I *\ ft-K i^/vJW\r(vj» 1KJ \jni> Lstrj 

PHRHLPGLCRPAATS 


5915 


1604 


703 


FPGRPTRPliKLGRRRKRARIIOAPHCUSPRPRTCPPGALOAPEA 
PASRAEGPVAVWNGHTEGPAPARSAPKEPPGLPR PLGSFP CPT 
PQEDFP ALGG PCP PRMP PS PGFSAWLLKGTP P P P PPGLV P P TS 
KPPPG?SGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PREbPGKEPSAHPVHOGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPEGL* *AAGPAAH \ 


5916


256 


633 


SPRMWEIMGPWHRWESFSLEGEWPSRIPEPSPDSTKGTSGKGCR 
TVTGAVH RHbNHV AG 1 1 PWVLHSQLKPTAATAQDQWTSQQYPDH 
FXRblbO* NQATADKNN * TTALLQPH QRlA VS PRMAEA 


5917 


1343 


827 


AHQI LTYbEP/ 1 CLWNYWKILTVFLTKSVLE1 * KF1HTPQTYR 


?*NnFFGIKEVyVSRSLRKTSP/RlAV?PLEOAWSKECVPVDQ 
PMEHLLPSLLSLASDPVPNVRVI.LAKALRQMLLBKAVFRNAGNP 
HLEV] F,E7 3 LALQS DRDQDVSFFAALEPXRRNI 3 DTAVLEXQtf 


5918 


13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRG PP6RPR PI*? 
PGPARRGR RRMETP FYGDE ALSGLGGGASGSGGTFAS PGR LFPG 
A?PrAAAGSMMKKDALTLSLSEQVAAALKPAPAPASYPPA\ADG 
APS AAP PDGI .LA5PDLGLLKLAS PELERLI I QSNGLVTTTPTSS 
QFLY P KVAAS EEQE FAEGFV KALEDLH KQNQLGAGRAAAAAAAA 
AGGPSGTATGSAPPCELAPAAAAPEAPVYA\NLSSY\AGGCRGL 
RGGAAT\ VAFAAEPVPF PPPPP PGALGPRR? /RLALQGRR PQTV 
PDVP\SFGESP\PLSP1ET\DTPRRI\KAXRKRL\RNPQ3RAPK 
P ASRKLG AOS RAL ER ES EDPS * S PEHGS LASTASLLK EOVAOLK 
QKVLSHVNSGCQLLPQHQVPAY 


5919


1 


4254 


TSVOGDSOGTPTSSOGSINMEHWISQAIHGSTTSTTSSSSTQSG 
GSGAAHRLADVMAOTH 1 ENHSAPPDVTTYTSEHS 1 OVER POGST 
GSRTAPKYGNAELMETGDGVPVSSRVSAK1QQLVNTLKRPKRPP 
LREFFVDDFEELLEVQQPDPNQPKPEGAOMLAMRGEOLGWTNW 
PPS LE AALQR WGT1 SP KAPCLTTWDTNG K PLY I LTYG K LWTRSM 
KV AY S 1 LH KLG TKOEPM VRPGDR VALV P P NNDP AAFM AA FY G CL 
LAEWPVP 1 EVPLTRKDAGSQQ1GFLLGSCGVTVALTSDACHKG 
LPKS PFGE3 PQFKGWPKLLWFVTES KHLS KPPRDWF\ PHIKDAN 
NDTAY]EYKTCK\DGSVLGVTVTRTALLTHCQALTOACGYTEAE 
71 VK V UD F KKDVGLWHG I L7SVMNMMHV 1 S I P Y SLMKVN PbS W I 
gKVCQYKAK VACVTK fR DMHWALVAHRDQRDINLS SbRML I VADG 
ANPWSI S£ CDAFLNVFOS KGLRQE V I CPCASS PEALTVA IRRPT 
DDSNOPFGRGVLSMHGLTYGVIRVDSEEKbSVLTVCDVGLVMPG 
Al MC SVKP DG V PQLCRTDEIGELCVCAVATGTS Y YGLSGI4TKNT 
PEVFAMTSSGAPISEYPFIRTGLLGFVGPGGI>VFWGKMDGLKV 
VSGRRHNADD1VATALAVEPMKFVYRGRIAVFSVTVLHPERIVI 
VAEQR ?OSTEEDS FCWMSRVLQA J DS I HOVGVYCLALVPANTLP 
KTPLGGIHLSETKOI>FLEGSLHPCNVliMCPHTCVTNLPKPRQK0 
PEJGPASVMVGNLVSGKRIAQASGRDLGCIEDNDOARKFIiFLSE 
VLQWRAOTTPDHI LYTLLNCRGAI ANSLTCVOLHKRAE K I AVML 
WERGHbODGDHVALVY PPGI DL1 AAFYGCLYAGCV PITVR PPHP 
CN I ATTLPTV KM! VEVS RSACLMTTQLI CKLLRSR EAA AA VDVR 
T K PL I LDTDD * PKKR PAOI CKPCN PDTLA YLDFS VSTTGMLAG V 
XMSHAATSAFCRS 1 KLOCELYPSRE VA I CLDPYCGLG FVLWCLC 
SVYSGHOSIL3PPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 
CT KGLG S OTES LKARGLDLSRVRTCVWAEERPR I ALTQS FS Kh 
FKDLGLH PRAVSTS FGCRVNLAI CLOGTSGPD PTTVYVDMRALR 
HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPLGDSH 
LGE 3 WVHSAKNASGYFT I YGDESLQSDHFNSRLS FGDTQT I WAR 
7GYLGFLRRTELTDANGERHDALYWGALDEAMELRGMRYHPID 
3 ETSVI RAHKS VTECAVFTWTNLLVVVVELDGSEOEALDL VPLV 
1-NWLEEHYLJVGWWVD3GVIP3NSRGEKCRMHLRDGFLADO 
LDP3YVAYNM 


5920 


1383 


1499 


QLGAVAHAGVSRI PP* LFPPLHPTFLSLWCLHHKLP/HPPGASM 
VRPPWPRRPPAHISSVROASTOVPRTVPHTQRVANIGTQTTGP 
SGVGCC7PGRPLLPCKCSSAAHSTYRVQEPAVH3 PGQEPLTASM 
IiAAAPLHEOKOMlGERLYPLIHDVHTQLAGKITGMLLEIDNSEL 
LLMLESPESLHAK3DEAVAVLQAHQAMEQPKAYMH 


5921


727 


157 


VCPGTGGE * GLWGOLGGLPKETPLKPMDAFTGSGLKRKFDDVDV 
GSSVSNSDDE3SSSDSADSCDSLNPPTTA5 FTPTS 3 LKRQKQLR 
RKim?FDOVTVYYFARRCX5I^SVPSCX3GSSLGMAORHNSVRSYT 
LCEFAQEQEVNHRE I LREHLKEEKLHAKKMKLTKNGTVESVEAD 
GLTLDDVSDED1DVENVEVI)DYFF3jOPLPTKRRRALLRASGVHR 
I DAEE KOELRA 3 RL S R EECGCDCR LYCP PE ACACS 0 AG I KCQVD 
RMSFPCGCSROGCGNMAGR3EFKP3RVRTHYLHT3MKLELESKR 
Q\GAAC0PQV*GALPDCQL<3PDRSTGL+ DPS WIGS KGLS FTGKG 
AAATHLI I LRVI ENRGAEGKRK 


5922


2475 | 4195 


sysnwglfpsvfiovprsrtgnlkpiflfysyyeVcmetlkgxt 








CLYKATCViCVCSPRNDRPDACVNPSEPAATTVFHlRTGLLLGDT " 
SKIITR1 h EKEIPKQJTLRFDACAAINSKKLEIGCGSLN* ERS* 
RVF.NK^THESGVCKNCAYWPCVI * AT*KKNKNDSVY1»0KGEAN 

LI KGEVH Y- CS PXPVFQTFYEELNLPAPELLXKTKKLFLQLAENV 
IFLLNGTS C YVRGGTT I GDR WFWEA * EL VPTDP APD 1 1 P I * KAE 
ASNF * VLK7S 1 1 RQYCI AREGKDF I I P VGKPNC ] GQKLYNSTTK 

TTT» * K>V- , TF'*>JDP<SVTQVI .VTll* &u&pc:u* nWTVPQd V* Yf** 
111 L/lxM . j t»AT»r rsi^f SIvljiW J\ 'rtn/iton u^jv^owbrn, 

RHRA Y FR1 PNKWADSCV I GT1 KPS FFLLP 1 KMG ELLGFS V Y ASR 
EKXGI Vj GNWKDNEWPRERI ICYYGPATWAQDGSWGYR/TP/VY 
MLNWI I R L OAI LEI 1 SNETGRALTVLAWQETQMRNAI YQNRLAL 
DYLLVAEGGVCRKFNLTNCCLOI NCOGQWKMI VRDMTXLAHVP 
IQVWHK FDPESLFGKWFPAI GSFKTLI VGVLLV I RTCLLLPCVL- 
PLLFOM I K G 1 VATLVHQKTS AHVNYMNHYR S I S QHDSKS EDSSE 
NSH 


5923 


13? 


63C 


QLCGRRGOS FRTS I KRMHP1 » RTCPNTNL/ 1 1 LLSQENTQI RDL 
QQENR ELK 1 SLEEHQDALELIMSKYR KQMLQLMVAKKAVI>AEPV 
LKAH0SH£AE1ESQIDRICEMGEVMRKAVQVDDLX?FCKI0EKLA 

QLELENXELRELLS issesloarkensmdtagoai k 


5924




2146 


ekgkvkdagaeqwislslsckgswetq:- , sn)jlnsltpptsvrrm 
pl1ttvtl1 kmvarhhkkllcskafstqlqqkl flhsqkgihhq 

SVCMKLKPNTSHIISILMGQPKALVOLETI^PLTI I IQKFQTQD 

hmkfwkn lflhshbltps vpqtv i p k ktgspe 1 k lk i tkt 1 ong 
relfessicgdllnkvcase\o*nosiesrkjekrkksnk>;dssr 
seerksh:<2pklepeeqnrpnervdtvsekpkeefvlkeg?pss 
antifcswngsvhw\fkfqvgdlvk£kvgtyfwwpcmvs£dpql 
evhtkintrgareyhvqffsnqperawvhekrvreykghkqyeb 
li aeatkcasnhsekqki rkpr pqrkrac/wdj g i ahaekalxmt 
reerieoytfiyidkopeealsgakksvasktevkktrrfrsvl 

NTQPEQTN'AGEVASSLSSTEIRRilSORRHTSAEEEEPPPVKJAW 
KTAAARK ? LP AS I7MHKGSLDLQKCNMS PWK I EOVFALQNATG 
DGXF 3 DOF V YSTKG I GNKTE I SVRGCDRLI 1 STPNQRNEKPTQS 
VSSPEATSGSTGSVEKKQCRRSIRTRSESEKSTEWPXKKIKKE 
QVGFLHVE5 


5925


2lt 


1911 


MMTAESR E ATGLSPQAAQEXDG 1 V 1 V KV EEEDEELSKHWGQDSTL 
QDTPPPDPF i FRQRFRRFCYQNTr GPh£lAijbRIjRr-LLnU"ijKrE» 
I NTKEO I LE I ;LVLEQ FLS I LP K£L0 VWLOE Y R PD5G E EA VTLLE 
nr.FT ni.Q^r n\7Pru^VHf2Dinv!i .2ii;f:MVPi.i')P\/nFCQQrnLHPEAT 
OSHFKKSSR X PR LLQSRALP AAH 1 PAP PHEG S PRDOAMASALFT 
ADSOAMVK 5 ZDMAVSLI LEEWGCONLARRULSRDNROFHYGSAF 
POGGEKRKENEESTSKAETSEDSASRGETTGRSOKEFGEKRDQE 
GKTGERQOKNPEEKTRKEKRDSGPAIGKJDKXTITCERGPREKGX 
GLGRSFSLSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSLIRKK 
IIHTGEKPYECSECGKAF\SLNS\NLVLHORI\HTGEKPHECNE 
CG KA FSHSS N L I LHOR I HSGEK P YECNECG KAFSC5 S D\ LTKHQ 
R IHTGEKP Y EC SECG KAFNRNS Y LI LHR R VHTREKP Y KCTKCGK 
\AFTRSSTLTLHHR1 HARERASEYS PASLBAFGAFLKSCV 


5926 


z 


233 


DRCLMLKQGSOPGSPPAT/CEPPAPPVYQAPCOSCPEPPGAIJEP 
SDSPHHTPVKPPPEHSAACPAPATCCPPPRSSMS 


5927 


414£ 


1248 


KHFSKFGSO ALYOLKRPASGQNS 1 SVMPAOK ITKP AAKYG1 PLA 
Y XK YGDK K L H EKKPLQKHKQAKQTPEKR VNTGE £R R K 1 SEE AAR 
K RR LEFI E K E K XQKDQI I S LWKAEQMKRQE K ERLER 1 NRAR EOG 
KRNVLSAGC ?GEVKAPFLGSGGTI APS SFS SRGQYEKYHAI FVQ 
MQOORAEDKEAKMXREIYGRGLPERQXGOLAVEWVKOVEEFLQR 
KREA^ONKARAEGHHGILQNLAA>JYGGRPSSSRGGKFRNKEEEV 
YXARLRQIRLQNFITCRQQIKAJCLRGEKKEANHSEGQSGSEEADM 
RRKK\IESL}LAIiANARAAVLKEOLERK5cKEAYEREKKVWEEHLV 
AKGVXSSDVSPPLGCHETGGSPSK00MRSV2SVTSALKEVGVDS 

sltdtret5eemqktnnaisskreilrrlnenlxauedekgk0n 
lsdtfeinv>:edakehbkeksvssdrkkweaggqlvipldeltl 
dtsfstterh7vgeviklgpngsprrawgksptdsvlk1lgeae 








LOLOTKLLENTTlKSElSPEGEKYKPLaTGEKKVOCISKEINPS " 

AIVDSPVE?KSPEFSEASP0MSLKLEGK1LE?DDLETEIZX3EPS 

GTNKDE\SLPCTJTDVWISEEKETKETQS;J5RIT1QENEVSEDG 

VSSTVDCI.SDIHIEPGTNDSOHSKCDVDKSVOPEFFI'HIT/VHSE 

HLNLVP0VQSVOCSPEESFAFRSHSHLPPKNKNKNSLLIGLSTG 

LFPANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRODWLEIDEI 

EDEKIKEGPSDSEDlVFEETDTDLOEliOASMEOLLREOPGEEYS 

EEEESVLKNSDVEFTAN6TDVADEDDNPSSESALNEEVIHSDNSD 

GEI ASECECDSVFNHLEEbRLPXEQEMGFEKFF EVYEK1 KAIHE 

EEDEN1E1 CSKI VCKILGNEHQHLYAKI LHLV14ADGAY0EDNDE 


5928 


4^4 6 


1248 


KHFSKFGSQALYQLKRPASCQNS I SVMPAQK • TKPAAKYG I PLA 
YKKYGDKKLHEKXPL-OKHXOAHQTPEKRVNTGEERRK 1 SEEAAR 
KRRLEFIEKEKK0HDQ1ISLMKAEQMKRCEKERLERINRAREQG 
KRNVLS AGGSGE V KAPFLG SGGTI APSS FS S RGO YEH YHA I FDQ 
MOOORAEDNEAKWiCREIYGRGLPEROKGOLAVERAKOVEEFLQR 
KR E AMONKARAEGHMG I LQNI^AAMYGGR PS S S RGGKP RNKEEEV 
YLAR LRQI K LQN FNERQQ I KAXLRGEXKEANHS EGQEGS EEADM 
RRKK\ I ESLKAKANARAAVLKEOLERKRKFA YEREKK VWEEHLV 
AKGVKSSDVSPPLGQHETGGSPSKQQMRSV1SVTSALKEVGVDS 
SLTD7RETSEEM0KTNNAISSKREILRRLNENLKA0EDEKGK0N 
LSDTFEINVHEDAKEHEKEKSVSSDRKKWEAGGOLVIFLDELTL 
DTSFSTTERHTVGKVIKLGPNGSPRRAWGKSPTDSVLK1LGEAE 
LOLOTELLENTTIRSElSPEGEKYKPLnXiEXKVOCISKEINPS 
A1VPSPVETKSPEFSEASPQMSI,KLEGNLEEPDDLETE1L0EPS 
GTNKDE\SLPCT1TDVWISESKETKET0SADR1TI0ENEVSEDG 
VSST VDQLSD 1H3 E PGTNDSQH S KCDVDKS VCPE P FFHKWJ1S E 
H LNLV PQVQS VQCS P EE S FA FR SHSHL P F KN KN XNS LL I GLS TG 
LFDANN P KMLRTCE LPDLS KLFRTLMDVPTVGDVRQDNL E I DEI 
EDEK1KEGPSDSEDIVFEETDTDLQELQASME0LLREQPGEEYS 
EEEES VLKNSDVE PTANGTDVADEDDN PS S ESALNEE WHS DNS D 
GEIASECECDSVFNHLEELRLHLEQEMGFEKFFEVYEK1 KAIHE 
D5DEN I EI CS KI VCN I bGNEHQHl.YAK I LKLVMADGAYQEDNDE 


5929


3 


1558 


l.DFSMTTQLPAYVAI LLFYVSRASCQDTFTAAVYEHAA1 LPNAT 
LTPVS REEALALKNRNt J3 1 LEGAITS AADQG AH 1 1 VTPEDA1 YG 
WNFNRDSLYPYLED I PDPEVNW I PCN T NRNRFGCTPVQ2RLSClA 
AKNNS1YVVA^1GDKKPCDTSDPQCPPCK3RYOYNTDVVF\DSOG 
KLVf\RYHKONLFMGENOFN\'PKEPEIVTFNTTFGS FG I FTCFDI 
liFHDF AVTLVKDF KVDTI VFPTAWMNVLP KLSAVE FHS AWAMGM 
RVNFLASN I HY V S K KMTGSGT Y APNS S RAF H YTJMKTEEGKLLLS 
0LDS}IPSHSA\A7NWTSYASS1EA1>SSGMK£FK3TVFFDEFTPVK 
LTGVAGNYTVCCKDLCCHLS YKMSEN 1 PNEV YALGAFDGLHTVE 
GRY YLQI CTLLKCKTTNLNTCGDSAETASTR FEMFSLS GTFGTQ 
YVrPEVLLSENQiAPGEFQVSTDGRLFSLKPTSGPVLTATTLFGR 
LY EKPW ASNAS SGLTAQAR 1 1MLI VI API VCSLSK 


5930 


113 


60B2 


rgncfw2 vpftmaortgledperylfvdravi ynpatqadwtak 
klvw ifserhg feaas i keergdevmvelaengk kamvnxddi q 
kmn p p k fs xve dka eltclneas vlhnlkdr y ysgl i y tysgl f 
cwi mpyknlpi ys en 1 1 emyrgkkrhempfh i ya i ses a yrcm 
lodredcs i lctgesgagktentkxviqylaji vasshkgrxdhn 
i pge\lerqllqan pi les fgnartvqndns s r fgkf 1 rinfdv 
tgy i vgani etylleksravrqakdertfh j fyqllsg\ agehl 
ksdlllegfnnyrflsngyipipgoVqdkgnfrgdpgeaj^hikg 
fsheeilsmlkwssvlofgnisfkkerntdoasmpentvaqkl 
CHLLGMNVME FTRA I L7PRI kvgrdyvqkaqtxeqadfaveala 

KATYERLFRWLVHR 1 NXALDRTKRQGASF 3 G I LDIAGFE I FELN 
S FEQLC I NY TNEKLQQ LFNHTMF I LEQEE YQR EG I EWNFIDFGL 
DLQPCI DLI ERPANF PGVLALLDEECWFPKATDKTFVEiaVQEQ 
G SHS KFQKP RQLKDKAD FC 1 1 HY AGKVDY KADEW IjMKNMDPLND 
NVATLLHQS SDRF VAELWKDVDR I VGLDQVTGMTETAFG SA Y KT 
KKGMFRTVGQLYKESLTK1MATLRNTNPNFVRCI I PNKEKRAGK 
LDP HLVLDOLRCNGV LEG I R I CRQGFPNR1 VFQEFRQRYEI LTP 








NA1PXGFMDGKQACERMI RAL.ELDPNL V R 1 GQS K 1 FFRAGVLAH 
LtKKRDLKITDIIlFFOAVCRGYLARKAFAKKQQQLSALKVLQR 
NCAAYLKLRHWQWWRVFTKVKPLLOVTKQEEEUJAKDEELLKVK 
EKQTKVEGELEEMERKHQQLLEEKN1 LAEQLQAETELFAEAEEM 
RARLA^KKQELEEILHDLSSKVEEEEERNQILQNEKKKMQAHIO 
DLEEObDEEEGAROKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 
1 KEXKLMEDR I AECSSQLAEF.E EXA K NLAK1 RJJKGEVM 1 SDLEE 
RLKKESKTROELEKAKRKLDGETTDIiODQIAELQAQlDELKLQL 
AX K FEE LOG ALARGDDETLH KNN ALKWR ELQAQ I AELQEDFES 
EKASRNKAEKQKRDtSEELEALKTE^EDTLDTTAACOELRTKRE 
CEVAELKKALEEETK^EAQIODMRQRHATALEELSEQLEQAKR 
FKANLEKNKOGLETDNKELACEVKVL0OVKAXSEHKRKKLDAQV 
QELHAJC^SEGDRLRVELAEKASKLONELDNVSTLLEEAEXKGIK 
FAKDAAS LESQL0DT0ELL0EETR0K LNLSS R I RQLEEEKNSLQ 
EQQEEEEE ARKNLEKQVLALQSQLADT K KKVDDDLGT IESLEEA 
KKKLLXDAEALSQRLEEKALAYDKLEXTKNRLOOELDDLTVDLD 
HQRQVASNLEKKQ\KKFDQLLAE£KSISARYAEERDRAEAEARE 
KETKALS LARALEEALEAKEEFERQNXO LRADMEDLMSS KDDVG 
KNVHELEKSKRALEOQVSEEMRTOLEEijEDELQATEDAKLRbEV 
NK0AMKA0FERDLQTRDEQNEEKKRLL1K0VRELEAELEDERKQ 
RALAVAS KKKME I DLKDLEAQI EAAN KARDE V 1 KQLR KLQAQMK 
DYQRELEEARASRDEIFAOSKESEKKLKSLEAEILQLQEELASS 
E RARR1 IAE0ER DE LADEI TN S ASG KS AL I JDEKR R LE AR I AO LEE 
ELEEEQSN'-nELLNDRFRKTTLQVDTLNAELAAERSAAQICIDNAR 
QQLER ON K ELKAKLOELEGAV KS KFK AT I S ALEAK I GQLEEQLE 
OEAKERAAANKLVR R TEKKLKE1 FMQVE DERRKADQY KEOME KA 
NARMKQL K RQLEEAE SEATRANASRR KLQRELDDATEANEG LS R 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVKETQPPQSE 


5931 


113 


608s 


RGNCF W I V P FTMAQR TGLED P E R YL F V D RA VI YN P ATOADW TAK 
KLVWI PS ERHGFEAAS I KEERGDETVM\^1J^ENGKKAMVNKDD1Q 
KMNPPKFSXVEDMAELTCLNEASVLXNLXDRYYSGLIYTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRREMPPHIYA1SSSAYRCM 
LQDREDOS I LCTG ES G AGKTEN TKKV1 Q Y LAHVAS S HXGR KPHN 
2PGE\LEROLI.iQANPlLESFGNARTVONDNSSRFGKF J RINFDV 
TGYI VGAN 1 ETYLLEKSRAVRQAXDERTFH I FYOLLSG \ AGEHL 
KSDLLLEG FNN YR FLSNGY I P I PGQ\QDK GNFRGDPG t AMH I MG 
FSKEEILSMLKWSSVLQFGN1SFKKEKNTDQASMPENTVAQKL 
CHLLGMNVMEFTRAI LTPRI KVGRDYVOKAQTKEQADFAVEALA 
KATYERLFRWLVHR I NKALDRT KRQGAS F I G I LDI AG FE I FELN 
S FEQLC I NYTNEKLQQt»FNHTM F I LEQE E YOHEG I E WN FI D FGL 
DLOPC1DLI ERPANPPGVLALLDEECWF? KATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKAJDFCI I HYAGKVDYKADSWLMKNMDPLND 
NVATLLHQ S SDR FVAE LWKDVDR I VGLDQ VT6MTETA FGS AY KT 
KKGM?RTVGQLYKESLTKLMATLRNTNPK FVRC1 1 PNHEKRAGK 
LDPHLVLDQLRCNGVLEG IR I CRQGFPNR 1 VFQEFRQR YEILTP 
NAI PKGFMDGKQACERMI RALE LD PNLYR 1GQSK1 F FRAG VLAH 
LEEERDLXI TDI 1 1 FFQAVCRGY LARKAF^iKXOQQLSALKVliOR 
NCAAYLiaRHWQWWRVFTKVKFLLQVTRCEEELOAKDEELLKVK 
EKQTKVEXSELEEMERKJIO^LLEEKNIl^OI^AETELFAEAEEM 
RARLAAKKOELEEI LHDLESRVEEEEERNOI LQNEXXXMQAH 10 
CLEEQLDEEEGARQKLOLEXVTAFJVXI KXKEEEI LLLEDQNSKF 
I KEKKLMEDR I AECSSQLAEEEEKAKNLAK I RNKQEVM1 SDLEE 
RLKKEEXTR0ELEKAKRKLDGETTDLQDC1AEL0AQIDELKL0L 
AKKEEELOX5ALARGDDETXHKNNALKVV1?EIjQA0IAEL0EDFES 

ekasrnkaekqkrdlseelealktele3tldttaaq0elrtkre 
Oevaelxkaleeetknkeaoiodmrorhataleelseoleoakr 
fkanleknkogletdnkelacevkvlqovkaesehkrkkldaqv 
qelhax^/segdriavelaekaskloneldnvstlleeaekkgik 
fakdaaslesqlqdt0elloeetr0klnlssr1rqleeexkslq 

EC<)BEEEEARKNLEX0V1ALQS0IvADTKKKVDDDLGTIESLEEA 








XKKLLKX)A£Ai>SORLEEKAL^.YDKLEKTKJ4RLOX5ELDDLTVDLD 
I-:CROVASNLEKKQ\KKFDOLLAXEKSlSARVAEERDRAEAEARE 
KFj"KALSLARALEEALEAKEEFERONKOLRADMEDLMSSKDDVG 
ICr^ELEKSKRALEOQVXEEMRTQLEELLDELQATEDAKLRLEV 
KMCAMKA0FERDL0TRDECNEEKKRLL1 KQVREI.EAELEDERKQ 
R7\ LAVAS KKKME1DLKDLEA0 J E AANKARDEVI KQLRXLQAQMK 
DVQRiLEEARASRDElFAQSKESEKKLKSLEAEILOLQEiLASS 
EkARRHAEOERDELADElTN^ASGKSALLDEKRRLEARIAQLEE 
ELEEEQSNME^LNDRFRKTTLOVDTL^AELAAERSAAOKSDNAR 
OCLERQNKELXAKLQELEGAVKSKFKATlSALEAKlGQtEEQLE 
OEAKSRAAANKLVRRTEKKLKEZFMOVEDERRHADOVKEQMEKA 
NARMKOLKRQLEEAEEEATRAKASRRKLORELDDATEANEGLSR 
EVSTLWRLRRGGF1SFSSSKSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 


572 


RHI.EEJCFLFLQKGRKLKLSGPRWEEGKPRGTGGLWVKAEANMG 
FCATLAVGLTIFVLSVVTI 1 1 CFTCSCCCLYXTCRRPRPV\APP 
FKPP/P\AmAPYPQPPSVPPSYPGPSYQCYHTnPPQPGMPAAPY 
FMQYPPPYPAQPMGPPAYKETLAGGAAAPYPASOPPYNPAYMDA 
FKAAL 


5933 


1 


3190 


GTHKLK.MADKTPGGSQKASSKTHSSDVHSSGSSDAHMDASGPSD 
SDK,PSRTRPKSPRKHNYRNESARESLCDSPliONl,SRPLLENKLK 
AFS3GXMSTAKRTLSKKE0EELKXKEUEKAAAEIYEEFLAAFEG 
ST.-GNKVKTFVRGGWN/iAKF.EhETDEKRGKlYKpssRFADOKNP 
PNOSSNERPPSLLVIETKKPPLKKGEKEKKXSNLELFKEBLKQI 
CtERDERHKTKGRLSRFEPPOiTDSDGQRRSM.DAPSRRWRSSGVL 
DL V A PG SHDVGD PSTT\N F Y 1 XI- N 1 \NPOMNLKKCCCCE FGR FG P 
LA5VKIMWPRTDEERARERNCGPyAFMNFRDAERAI,KN],NGK>5I 
VJS F SMKLGViGKAV PI PPHP1Y 1 PPSMMEHTLPPPPSGLPFNAQP 
REKLKNPNAPMLPPPKNKEDFEKTLS0A1 VKWIPTERNLLALI 
HK M 1 EFWR EG PMFEAM 1 MNR E 1 NNPMFR FLFENOTPAHWY R W 
XLYS I LQGDS PTKWRTEDFRMF KNGS FWRP P PLNPYLHGMS EEQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNDIGDAMVFC 
I.KNAFAAFEIVDCITESL^I LKTPLPKKI ARIjYLVSDVLYN«5SA 
KVANAS YYR KFFETKLCO I FS DLNAT YR T I OG H LQSEN FKQRVM 
TCFRAWEDWAI YPEPFLI KLONI FLGLVNI 1 EEKETKUVPDDLD 
GAr 2 EEELDGAPLEDVDG1 PIDATPIDDLDGVPIKSLDDDLDGV 
PLCATEDS KKNEPI FKVAPS K'a^VDESELEAOAVTTSKWELFD 
OHZESEEEEN0NOEEESEDEEDT0SSKSEEm-;bYSNP3KEEMTE 
SKFSKYSEMSEEKRAKLRE3ELKVMKFQDELESGKRPKKPGQSF 
OECVEHYRDKLLOREKEKELERERERDKKDKEKbESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSHSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDV5KKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTY"WKCDLF 
LCPERSVF 


5934 


1 


3390 


GTRKbKMADKTPGGSQKASSKTRSSDVKSSGSSDAHMDASGPSD 

SD>!P£RTRPKSPR khn yrnesar eslcds phonlsrpllenklk 

AFSIGKMSTAKRTIjSKKEQEELKKKEDEKAAAEIYEEFLAAFEG 

sdc-nkvktfvrggwnaakeehetdekrgkiyxpssrfadokwp 
pncssnerppsllvietkkpplkxgekekkksnlelfkeelkqi 
oeerderh ktkgrlsrfep pqs dsdgorr s mdapsr rnrssg vl 
ddvapgshdvgdpstt\nfylgni\npqmnlkkcccqefgrfgp 
1j^?vkimwprtdeerar£rncgfvafmnrrdaeralknlngkmi 

MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSG^PFNAOP 
REr.LKlJPNAPMLPPPKNKEDFEKTLSOAlVKWIPTERmjLALI 
HRK3 EFWREGPMFEAMIMNRE1 NNPMFRFLFENQTPAHVYYRW 
KLY S I LQGDS PTKWRTSDFRM FKNGS FWR P PPLNPYLHGMS EEQ 
ETEAFVEEPSKKGALKEEORDKLEEILRGLTPRKNniGDAKVFC 
LNNAEAAEEI VDCI TESLSI LKTPLPKKI ARLYLVSDVLYNSSA 
KV A*C AS YY R KFFET KLCQ I F S DLN ATYRT I QGHLQS EN F KQR VM 
TCr xAWEDWAIYPEPFLIKLQNl FLGLVNI I EEXETEDVPDDLD 
GAP I EEELDGAPLEDVDGIP I DATPI DDLDGVPI KSLDDDLDGV 








pl:>.tedskknepifkvapskwf:avdeseleaoavt7s:<welfd 
qh;l5eeeenqnqeeesedeedtqsskslthhlysnpikeekte 
sk?: kysemseeknaklrelelkvmkfqdelesgkkpkkpgcsf 
qec v'ehyt^dkllqrekekelekererdkxdkexlesrskdkkek 

DECTPTRKERKRRKSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKU5;SRSRSSHKDSPKDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KK5CKKSRSQSRSPHRSHKKSKGXTNTGRKFFKKAVTYWKCCLF 
LCTTRSVF 


5935


3 


4493 



$ Y ic: .£ GWRLS R PPRQ FWAGWR G I G R jFGTMAPVHGDDCE 1 G ASAL 
SDSi-SFVSSRARREKKSXKGRQEAbERLKKAKAGERVKYEVEDF 
TGV ; HEVBEEOYSKLVOARODDDWI VDDDG1GYVEDGRE1 FDDD 
LEF PALDADE KGKDGKARNKDKRNVKKLAVTKPNNI KSMFI ACA 
GKKTADKAVDLSKDGLLGDILODLNTETPQITPPPVWILKKKRS 
]GA.< PNPFSVHTATAVPSGKIASPVSRKEPPl/TPVPLKRAEFAG 
DDVCWESTEEEQESGAMEFECGDFDEPMEVEEVDLEPMAAKAWD 
K£Sl PAEEVKOEACSGKGTVSYLGSFLPDVSCWDIDOEGDSSFS 
VQEVCVDSSHLPLVKGADEEOVFHFYWLDAYEDOYNOPGWFLF 
GKVKIESAETHVSCCVMV.KNIERTIjYFLPREMKIDLNTGKETGT 
F 1 5 K KDVYEE FDEK 1 ATKY Kl M K FKSKPVE KNYAFE2 PDV PE KS 
EY1J VKYSAEMPQLFODbKGETFSHVFGTNTSSbELF^JRKlK 
GPCV...EVKKSTALN0PVSWCKVEAMALKPD1.VTJVIKEVSPPPLV 
V MA 1 • S MKTMQN AKNHQN EI I AKAALVKHS FALDKAAF KP P FQSH 
FCW5; KPKDCI FPYAFKEV1EKKNVKVEVAATERTLLGFFLAKV 
HKJ I PDI 3 VGHKI YGFELEVLL0RINVCKAPHMSK1CRLKRGNM 
FK).C : GKSGFGERKATCGRMICDVEISAKEL1RCKSYHLSELVOO 
ILKTERWIPMENIONMYSESSOLLYLLEHTWKDANKFIIjOIMC 
ELNxOPLAL0ITNIAGNIMSRTLNGGRSERNEFI*L»LHAFYEN>JY 
IVFi;KQIFRKPO0KLGDEDEEIDGDTNKYKKGRKKGAYAGGLVI. 
DFKVGFYDKFI I>LLDFNSLYPSI IQEFNICFTTVQRVASEAQKV 
TE IX : E QEQ1 PE LPDPS 1,F.MG 1 1 ,F RE I RKLVERRKQVKOXHXQOD 
IXP7AA I.QYDI RQKALKLTANSMYGCLGFSYSRFYAKPIAALVT 
YKG> E I LMHTKEMVOKKNLEVI YGDTDS IM I NTNSTN LEEVFKJL 
GNKVK SEVNKLYKLLE1 DI DGVFKSLLLLKKKKYAALWEPTSD 
GNYVTKQELKGLDIVRRDWCDLAKDTGNFVIGOILSDOSRDTIV 
EN J < - KR h I E3 G ENV LNG S V P VSQ FE I NKAI/TKDPQDY PDKKSLP 
HVI: \ A LNINSOGGRKVKAGDTVS YVI CODGSNLTASORAYAPEO 
LOKC -IKLTI DTQYY LAQQ 3 HPWAE I CEP1 DG 1 DAVL I ATG WEL 
\DF10FKVHHYHKDEENDALl.GGPAQLTDEEKYRDCERFKCPCP 
TCC^. TNI YDNVFDGSGTDMEPSLYRCSN1DCKASPLTFTV0LSN 
KL1 N.P 1 RR Fl KKY YPGWL1 CE£PTCRNRTRHLPLQFSRTGP1>CP 
ACMV^TLOPEYSDKSLYTOLCFYRYIFDAECALEKLTTDHEKDK 
LKKC F FTPKVLQD YRKLKNTAEQFXS RSG YSEVNLS KliFAGCAV 
KS 


5936 


1124 


13? 


RGEK'FDAEFRRFACLGFGERLQEFSRLLRAVHRSKAWCYLAI 
RMLKATCCPSPTTTAC'JTJPWORAPPLRLLVOKREADSSGLAFAS 
NSL/C RKKGL1 LRPVAPLRTR PPLLISLPQDFROVSSV I DVDLL 
PETHnRVRLHKHGSDRPLGFYI RDGMS VRVAPOG \LER VPGI FI 
SRLVF GGLAESTGLLA VS DEI LE VNGI EVAGKTLNQVTDMMVAN 
SHK\JIVTVKPA^QRNNVVRGASGRLTGPPSAGPGPAEPDSDDD 
SSDI V 1 ENROPPSSNGLSOGPPCWDDHPGCRBPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


2600 


PTSL. KSTVOLNiCRLLUDKRYQC V YSLAE1 FKVLASFY VILV3L 
YGL7S SYSLWWMLRSS LKQYSFEALREKSN YSDI PDVKNDFAFI 
LHl^.LQYDPLYSKilFSlFLSEVSEWLKQINL^EVfrVEKLKSK 
LVKN^ODKIELHLFMLNGLPDNVFELTEMEVLSLELIPEVKLPS 
AVSCLVNl>KELRVYHSSLVVDHPAIAFLEENLKILRl»KFrEMGK 
I PRVrv FHLKNEKELYLSGCVLPEOIiSTMOLEGFODLKNLRTLyL 
KSSLSRIP0VVTDLLPSLOKXSLDNEGSKLVV1JWLKKJ4VNLKS 
LEL; VCDIiERI PHS I FSLNNLHELDLRENNLKTVEEI ISFQHLQ 
NLS C l.-XL WHNJN I AY I P AQ I GALS NLEQLS LDHNN I ENL PLQLFL 
CTKL;-rYXDI^YhJHLTFIPEEIOYli\SNLOyFAVTNNN3EMIj 








IiFOCKKLQCLLU5KNSLMNLSPHVGELS»bTHREP3G\l4YLETL " 
PPELEGCQSLKRNCLIVEENLLNTLPLPVTERLQTCLDKC 


5938


3 9b 


186? 


yKGEGFFCNQEARGERRKKXKAMSSPNIWSTGSSVySTPVFSQK 
frrVWILLLLSLYPGFTSQKSDDDYBDYASMKTWVLTPKVPEQDV 
TVIUJNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINME 
YTIDIFFAQTWYDRRLKFNSTlKVbRLNSNMVGKlWlPDTFFRK 
SXKADAHWITTPNRNLR I HNDGRVLYSLRLTX DAECQIjQLHNFP 
MDEHSCPLEFSSYGYPREEIVYQWKRSSVEVGDTRSKRLYOFSF 
VGLRNTTE WXTTS GDYWMSVY FDLSRRMGY FTIQTYI PCTL3 
WLSWVSFWIN KDA V P AR TS LG 1 TT VLTMTTLSTI AR KS LP KVS 
YVTAMDLFVSVCF1FVFSAI,VEYG\TLIIYFVSNRKPSKDKDKKK 
KNPAPT1DIRPRSATI0MNNATHLQERDEEYGYECLDGKDCASF 
FC C FE OCR TGA W RHGRIHIRI AKMD S YAR I F F P7AFCLFNLVYW 
VS V LYL 


5939



66 


1404


1RPGYLKEV0KMSPGHRAGLEPFFDF1VSINGSRLNKDNDTUKD 
LLKANVEK PVKMLIY S S KTLELRETS VTP S N LVJGGQGLLG VS IR 
FCSFIXSAKEtrwHVLEVESNSPAAlAGLRPHSDYIlGADTVMNE 

s e dl fs l 1 et)} e ak p l k l y v yntdtdn cre v 1 1 tpns awggegs 

vqlssvnppsls ppgttgi eqsltglsisstp\ pavssvlstgv 
ptvp\llpp0vncsltsvppmessyiihlpglmpftrqglpnl>po j 
pstfnlpr\ptkswpgvglyoefvkpgvlpplssmpprnlpg\i 1 
aplplpseflps fpbvpesssaassgellsslpptsnapsdpat 
ttakadaassltvdvtpptakapttvedrvgdstpvsekpvsaa 1 
vd anas esp , 


5940 


145 


71-, 


rrsasrsasprosagtavttgttiaggtclaaahhrmrwradgrs 

LEKLPVHMGLV3 tev eqe PSFSD1 ASLWWCMAVGI SY 1 SVYDH 
QG I FXRNNS RLMDE 1 LKQQQELLGLDCS KY SPEFANSNDX DDOV 
LNCHLAVKVLS P E DG KAD I VRAAQDFCQLVAOXQ KR PTDLD VDT 
LA\ VYLVQMWL 1 1*1 


5941 



13 


6147 


mclgrmgassprspepvgppapglpfccggsllawVllalpva 

WGQCNA?EW\ LP FAR PTNLTPEFEFP IGTYLN YECRPGYSGR PF 
S 1 1 CL KNS VWTGAKDRCRR KS CRNPPDP VNGKVHVI KG IQFGS 0 
IKYSCTXGYRLIGSSSATCIISGDTVIWDNETPICDRIPCGLPP 
TITNGDFI STPfREN FH YGSWTYRCNPGSGGRKVFELVGEPS I Y 
CTSNDDgVGI WSGFAPCC1 2 PNKCTPPNVENGI LVSDNRSLFSL 
NEWEFRCQPGFVMXGPRRVKCOALNKWEPELPSCSRVCQPPPD 
VLHAERTORDXDN FS PGOEVFY 5CE PGYDLRGAASMRCTPQGDW 
S PAAPTCE VKS CDDFMGOLLNGRVLFPVNLQLGAKVBFVCDEGF 
0LKGSSASYCVLAGMESLWNSSVPVCEQ1FCPSPPV1PNGRHTG 
XPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPOGNG 
VWSSPAPRCGI LGHCOAFDHFLFAKLXTQrNASDFP I GTSLK YE 
CRPEY YGRPFS I TCLDNLVWS S PKDVCKRXS CKTP PDPVNGMVH 
VITDI<?VGSRINYSCrrGHRLIGH5SAECILSGNAAH!VSTXPPI 
CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 
FELVGEPSIYCTSNDD0VG1WSGPAP0CIIPNKCTPPNVENGIL 
VSDNRSLFSLNEWEFRCOPGFVMKGPRRVKCQALNXWEPELPS 
CSRVCQPPPDVLH/iERTQRDKDNFSPGQEVFYSCEPGYDLRGAA 
SKRCTPOGDWSPAAPTCEVXS CI>DFNGQ1>L*K3RVLFPVNLQLGA 
KVDFVCDEGFOLKGSSAS YCVLAGMESLWNSSVPVCEQI FCPS P 
PVIPNGRHTGKFLEVFPFGKAVNYTCDPHPDRGTSFDL1GEST3 
R CTSDPQGNG VWSS PAPR CG I LGHCOAPDHFLFAKLKTQTNAS D 
FPIGTSLKYECRPEYYGRPFSITCLONLVWSSPKDVCXRXSCKT 
PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 
TAHWSTKP P I CQR 1 PCGLPPT I ANGDFI STHR3NFKYGS WT YR 
CNLGSRGRXVFELVGEPSIYCTSin>DQVGlWSGPAPQC:iPNKC 
TPPNVENG 1 LV5DNR S LFSLTJE WEFRCQPG FVMKGPRRVKCQA 
LWKWEPBLPS CSRVCOP PPE I LHGEHTPSHQDN FSPGOEV?YS C 
EPGYDLRGAASLHCT POGDWS PEA PRCAVKS CDDFLGOLPHGRV 
LFPLWLOLGAKVSFVCDEGFRLKGSSVSHCVLVGMRSLWNNSVP 
VCEHI FCPNP PA I LNGR HTGTPSGD I PYGXEI S Y TCDPH PDRGM 








TFNLIGESTJRCTSDPHGNGVKSSPAPRCELSVRAGHCXTPEQF 
PFASPT3 PINDFEFPVGTSLNYECRPGYFGKMFSISCLENLWfS 
S VRDNCRR KS CG P F PE PFNGMVH I HTDTOFG STVN YSCNEGFRL 
3GSPSTTCLVSGMNVTWDXKAPICEIISCEPPPTISNGDFYSKN 
RTS FHNGT WTYOCKTGPDGEQLFELVGERS 3 YCTSKDDQVGVK 
SSPPPRCISTNKCTAPEVENAIRVPGNRSFFSLTEI1RFRCQPG 
FVMVGSHTVOCOTNGRWGPKLPHCSRVCQPPPEILHGEHTLSHO 
DNFSPGOEVFYSCEPSYDLRGAASLHCTPQGDWSPEAPRCTVKS 
CUDFLGOLPHGRVLLFLNLQLGAKVSFVCDEGFRUCGRSASHCV 
LAGMKALWNSSVPVCEQI FCPNPPAI LNGRHTGTPLCDI PYGKE 
VSYTCDPHFDRGMTFNLIGESTIRRTSEPKGNGVWSSPAPRCEL 
PVGAACPHPPKI0NGHYIGGHVSLYLPGMT1SYTCDFGYLLVGK 
GF 1 FCTDQG I WSQLDHYCKEVNCSFPLFMNG 3 SKELEMKKVYHY 
GDYVTLKCEIXJYTLEGSPVJSOCCADDRWDPPLAKCTSRTHDAIjl 
VGTLSGT2 FFI LLI 1 FLSWI I LKKR KGNN AKENPK EVA 1 HLKSQ 
GGSSVHPRTLOTNEENSRVLP 


5942 


4509 


688 

1 


YLY\T*MRANP1J^YGISHKAYQ3DPPL\RKHREQ\LV1E\VGRKL 
DXYAQMI RFEERTGYFS STDLGRTAS HYY I KYNTI ETFN ELFDA 
HKTEGDI FA 2 VSKAEEFOO I KVREEEI EEIiDTLLSNFCELS TPG 
GVENSYGKIN1 LL0TYINRGEMDSFSLISDSAYVA0NAAR1 VRA 
LFE1 ALR KR V4 P TMT Y RLIjNLS KA I DXRLWG W AS P LRQFS 3 L? PH 
MLTRLEEKXLTVDKLKDMR KDEI GH3 LHHVN3 GLKVKQCVHQ3 P 
S VMMEAF1 OF J TKTVLR VTLS 3 YADFTWNDQVUGTVGE P W W I WV 
EDPTNDH I YHS E Y FLAL KKQV I SKEAQLLVFT3 P I FEPLPS QY Y 
3 RAVSDRWI/3AEAVCI IKFQHlil LPERHP PHTELLDLQPLP 1 TA 
1.GCKAYE ALYKFSHFNPVOTQ 3 FHTLYHTDCKVLLGAPTGS G XT 
VAAELAI FRVFNKYPTSXAVYIAPLKALVRERMDDWKVR1 EE.KL 
GKICVIEliTGDVTPDMKSlAKADLIVTTPEKWCGVSRSWCNRNYV 
QQVT 3 LI 1 DE 3 H LLGEERG PVLEVIVSRTNFI SSHTEKP VR 2 VG 
LSTAIANARDLADWLN 2 KQMGLFNFR PS VRPVPLEVHI OG F PGQ 
H YCPRMAS MNK P AFCA IRSHSPAKPVLI FVSSR RQTR LTAL ELI 
AFLATEEDPKQKLNMDEREMENI IATVRDSr^LKLTLAFG3 GMHH 
AGUHERDRKTVEELFVNCKVQVL1ATST1AWGVNFPAHLV3 1 KG 
TEYYDGKTRRYVDFPITDVl^MHGRAGRPOFDDQGKAVILVHDI 
KXDFY KK FL YEPFP VESSLLG VLSDHLNAE I AGGT I TSKODALD 
Y ITWTYFFRRLIMNPSYYNI/JDVSHDSVNKFLSIILI EKSL3 ELE 
1J5YC2 El GEDNRS 3 EPLTYGRI ASYYYLKHQTVKMFKDRXKF EC 
STEELLS 3 LSDAEE YTDLPVRHREDHMNS EIoAXCLP IESNPH S F 
nSPHTKAHLLU^ARI^RAMLPCPDYOTDTKTVWALRVCOAML 
DVAANQGMLVTVLN I TNL2 QMV I QGR WLKDSS LLTLPN I ENHHL 
HLFKKWKPIKKGPHARGRTSIECLPELIHACGGKDHVFSS^VES 
ELHAAKTXOAWNFLSHLPEI NVG I SVRGSWDDLVEGHNELS VST 
LTADKRDDNKW I KLHADOEYVLQVSLQR VHFGFHKGKPESCA VT 
PRFPKSKDEGWFL3LGEVDKRELIALKRVGY2RNHHVASI>SPYT 
PEI PGRY 3 YTLY FMS DCYLGLDQQYD /KLSQRYTS ES FCTGQHQ 
Gh 


5943 


1 


2274 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWOTWLPNHVVFLRLR 
EGLKNQS PTE AEK FAS S S LPSS P P PQLiTRNW FGLGG ELFL WD 
GEDSSFL\TVRLRGPSGGG\EEPALSQYORLLCINPPLFEIYOVL 
LSPTQHHVALI GI KG1WVX.ELPKRWGKNSEFEGGKSTVNCSTTP 
VAERFFTSSTSLTLKHAAKYPSEILDPHVVLLTSDNVIRIYSLR 
EPQTPTNVI I bSEAEEESLVLKKGRAYTASLGETAVAFDFGPLA 
AVP KTLFGONGKDEWAY PLY3 LY ENGETFLTY 1 SLLH S PGN / 1 
WKA VGS 1AHAS \ AAEDN YG YDACAVLCLPCVPNI LVI ATESGML 
YHCVVLEGEEEDDHTSEKSVnJSRIDIilPSLYVFECVELELALKL 
ASGEDDPFDSDF5CPVKLHRJDPKCPSRYHCTHEAGVHSVGLTWI 
HXLH KFLGSDEE DKDSLOELSTBOKCFVEH ILCTKPLPCROPAP 
IRGFVIVPDI3^PTM3CITSTYBCLIWPLLSTVHPASPPLLCTR 
EDVEVAESPLRVLAETPDSFEKHIRSIbQRSVANPAFLKASEKD 
1APPPEECK>LLSRATOVFREOY3LKODLAKEEIQRRVKLLCDO 
KK KQLEDLS YCR E ERKS LREMAER LADKYEEAKE KOED I MNR MX 
KLLHSFK? ELPVLSDSERDMXKELQLI PDC/LRH LGNA1 KQVTMK 
KD YO 00 K**1F KV LSLP K P T 1 1 LSAYQRXC2 G/S 1 1,K L EG EH I REMV 
KQINDIRMKVNF 


5944 


167 


342B 


FS3ATFTDEFEVLTEFPSATT1TT3GISATKTTLAGSHGKRNNT 
ITTTSSKRKNRKNKITPEKV0I1FDDPLPISYS0PEKVNGESKS 
SSTSESGDSDKMR15SCSDESSNSNSSKKSDNHSPAWTTTVSS 
JCXQPSVL V TFPKEERKSVSC KAS I KLS ETI SEG7SNSLSTCTKS 
GPSPliSS P:vGKL?VASPKRGOKREEGWKKWKRS KKVSVPSTVI 
PRVJ GRGGCN1 NAI REFTGAH I Dl DKQKDKTGDR J I TI RGGTE5 
TRCATOMKALIKDPDKEIDEIJPKNRLKSSSANSKIGSSAPTT 
TAAN TSLMG I KMTTVALSSTS QTATALTV PAISS ASTHXT 1 XNP 
VN \NV R PG FPVS FP\ I AY PP POF AHALLAA0TFO01 RPPR LPMT 
HFGGTKPFAOSTVIGPFPVRPLSPARATNSPKPHMVPRRONQNSS 
GSQVWSAGSLTSSPTTrTSSSASTVPGTSTNGSPSSPSVRRQLF 
VTV\'KTSTCA'ITTTVTTTASN1JNTAPTNATYPMPTAKEHYPVSS? 
SSPFPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 
TNTRPPN^SSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 
EVRKTV p P LATSSAPVAVPS TAPVT Y FMPOTPMGCPQPTPKMET 
?A1 R PPPHG1TAPHKNSAS VQNSSVAVLS VNHI KRPHS VPSS VQ 
^PSTLSTO^ACQNSVHPANKPlAPNFSAi'LPFGPFSTLFENSPT 
SAHAFWGGSWSSOSTPESMLSGKSSYliPNSDPUHOSDTSKAPG 
FR PF LOR P A PS PSG I VNMDS P YGSVTP SS THLGN F ASN I SGGQM 
YGPGA PLGGAPAAAN FNRCH F£ PLSLLTPCSSAt NDS SAQS VSS 
GVRAPSPAPSSVPLGSEKPSNVSODRKVPVPIGTERSARIRQTG 
rSAPSVJGSNLSTSVGHSGJWSFEGJGGNODKVDWCNPGMGNPM 
IHRPMSDPGVFS0H0AMERr5TGIVTPSGTFH0HVPAGY*MDFPK 
VGGMPFSVYGNAMIPPVAP3 PDGAGGPI FNGPHAADPSWNSLIK 
MVSSSTENNGPQTVHTGPWAPHMNSVHMNOLG 


5945 




197 


GVTHLFLFGKRKLRNGIAEDLKGOADFFFbLVSEAVn/ATGSPRA 
WLTCLI LPL PGII FS VLPKAHSRPLL1 TFTPATDPSDLWXDGOQ 
0POPEKPESTLDGAAARAFYEAL1GDESSAPDSORS0TEPARER 
KRXKRR 3 MKAPAAEAVAEGASGRHGQGR SLEAEDKMTHR 1 LRAA 
OEGDLPELR RLLEPHEAGGAGGNI NARDAFWWTPLHCAARAGQG 
AAVS Y LLG RGAAWVGVCELSGRDAAQLAEEAGFPF VARMVRESH 
GET R S FEN R SPTPSLQY CENCDTH FQDEN) i RTS TAK LLS LSOG P 

LKRDQEGLGYRSAPQ PR VTHFPAWDTRAVAGRE\0 PPRVATLSW 
KEERKREE \ KDRAWERDLRT YMNLEF 


5946


541


1666 


ILGS YSSIOPEEYS\$WC\EWLODLLA\YVSPK\HSYLRDLP
S EGSPQRVNS I DFV\EL\ EHLQPDVLVHAV LRWEF /TL LTEAV
YSYRG OKQ K KVMLTVEQAODQHYALVLVIGPGAAW\ Y PQLQRKKG
YIWEFKYBFVOCNYTLENLELHTTPWSSCECLFDDDIRAITFKA

KFOKEAPS F VKI SDLATHLEDKCSGWL I KAQ I S ELAFP I TPSQ 
KIAI.K*AH£SLKSIFSSLPN2VYTGCAKCGL£LETDENRiyKQCF 
SCLPFTMKK1 YYRPALMTAIDGRHDVCIRVESKL1EKILLNISA 
DCLNRVIVPSSEITYGMWADLFHSLLAVSAEPCVLKIOSLFVL 
DENS Y PLOODFSLLDFYPDI VKHGANAR L 


5947 


3 


1317 


RG I PDRRRRG P I GRVNMDLENXVKKMGLGHEQGFGAPCLKCKEK 
CEGFELHFKRKICRNC\NVAKKSM/TVLLSNEEDRKVGKXF3DT 
KYTTL I AK LKSDGI PMYKRNVMI LTNPVAAKKNVS I NTVTYEWA 
F p VQNOALAJROYMOMLPKE KQPVAG S EGAOYRKXCLAKQLPAHD 
CDPSKCHELSPREVKEMEOFVKKYXSEALGVGDVKLPCEMDAQG 
FXQMN 1 PGGDRSTPAAVGAMEDKSAEHKRTQYSCYCCKLSMKEG 
EPAI Y AERAGYDKLWHPACFVCSTCHELLVDM I Y r WKNBKLYCG 
RHYCDSEKPRCAGCDELIFSMEYTQAENONWHLKKFCCFDCDSI 
LAG El YVMVNDKPVCKPCYVKNHAVVCOGCKNAIDPEVQRVTYN 
NPSWKASTECFLCSCCSKCL1GQKFMPVEGMVFCSVECKKRMS 


5948 


29 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPZDg 
G^HYOMRRKGRCHRGSAARHPSSPCSVKHSPTRETXrYAOAQRM 
VEI E J EGRi KRIS I FDPLE2 1 LEDDLTAOEKSECNSMKENSERP 
PVCXRTKlWKinWVKKKNEALPSA^GTPASASALPEPKVRIVEy' 



WO 01/53312 



PCT/US00/34263



S P ? £ A FKRP PVYYK F I E X S AEEXDNE VE YDMDEED YAW LEI VNB 
KRKODCVPAVSQSMFEFLMDRFEKSSHCENQKOGEQQS LI DEDA 
VCC: C V .DGECQNSNVILFCDMCNLAVHCECYGVFY1 PEGQWLC/ 
F^Ci<:CRARPADCVLCPNKGGAFKKTDDDRVlGHV\VCAL,W\ J P 
E \VG F AKTV F I EP 1 DG VRNI P? AR WKLT \ CNLCK E KGR / VGAC 1 
QC H K *.N C V T AFKVT CAC; KAGLY M KME PV KELTGGGTT F S VR K T A 
YCDVHTFPGCTRRPIjNIYGDVEMKNGVCRKESSVKTVRSTSKVR 
K KAX K * X KALAE PCAVEPTVCAP Y I P PQR LNR I ANQVA1 QKK KQ 
FVEKJ-.^SVV^LKRLSRNGAPLLRRLOSSbOSOR^SOORENDEEM 
XAAKF Kl KYW0RLRHDLERARLL1ELLRKREKLKREQVKVE0VA 
ME1R I >T P LT VLLR SV LDQLQDKD PAR I FAOP VS LK EVPD Y I jDH 1 
KHPKDFATMRKRLEAOGYKNIoHEFEEDFDLIIDNCKKYNARDTV 
FYRAA VR LR DQGG WLRQAR R EVPS I GLK EASGMHLP ER PAAA P 
RRFr5WEDVDRLLDFANRAHLGLEE0LRELLDMLDLTCAf>!KSSG 
SR5KKAK1LKKEIALLRNKLSG0HSQPLFTGPGLEGFEEDGAAL 
GPEAGL'EVapRLETLLQPRKRSRSTCGDSEVEEESPC-KRLDAGL 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLiCDSS 
FNA PKCGHG KPALVRRHTLEDRSEL I SCI ENGtfYAKAAR ! AAEV 
COSSMWJ GTDAAASVLEPLKVVWAKCSGYPSYFALI IDPKKPRV 
PGHHNGVTIPAPPLDVLKJ6EHMOTKSDEKLFLVLFFDNKRSW0 
WLPKf KMVPLG J DETI DKLKMMEG'RJNSS J RKAVK 7 AFDRAMNH1 
SRVKC r. PTSDLSDI D 


5949 


39 


3370 


YRERVPVyGGSVLRSALEVCWDFI.SGLTEGSLLPEGFKSGPIDO 
GrvHYCKRRKGRCHRGSAARHPSSFCSVKJI^PTRETLTYAOAQRM 
VEI E : FG RLHRIS I FDPLE I I LEDDL7AQEMSECNSNKENSKRP 
PVCLF. ? K 3 H KNNR V KKKK EAL PS AHGT PAS AS ALPE PK VJU V 5 Y 
SPVSAFRRPPVYYKFI EKS AESLDKEVEYDMDEEDYAWLEI VNE 
KRKCPCVPAVSOSMFEFLMDRFEKESHCENQKQGEQOSLIDEDA 
VCC1CKDGEC0NSNVI LFCDMCNLAVHQECYGVPY I PEGQWLC/ 
RAKCL»eSRARPADCVLCPNKGGAFKKTDDDRWGirV\VCALW\ I P 
E\VGFANTVFIEPIDGVRN1PPARWKLT\CNLCKEKGR/VGAC1 
OCHKANCYTAFHVTCAOKAGLYMKWEPVKELTGGGTTFSVP.KTA 
YCDWTPFGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAXKA K KALAE PCAV LPTVCAP Y I P PQR LNR I ANQ V AT OR KKQ 
FVER^^'i' YS'LLKRLSRNGAPLLRRLOSSLOSQRSSQQRENDEtM 
KAAKE K1,K V WQRLRHDLERARLL I ELLR KR E KLKR EOVKVEOVA 
MELR1- 1 PLT'/LLiRSVLDOLODKDPARl FA0PVSLKEVPDYLDH1 
KHP>*PV-ATK;RKRLEAOGYKNLHEFEEDFDL3 1 DNCMXYNARDTV 
FY RAM'K LRDQGGVVLRQARREVDSI GLEEASGKHLPERPAAAP 
RR P KS v: E DVDR LLD PAN RAHLGLEEOLRELLDMLuLTCAMKS SG 
SRSWV^KLLXKElALLRNKLSOOlSQPLPTGPGLEGFEEDGAAi 
GPEAGEEVLPRLETLLOPRXRSRSTCCDSEVEEEEPGKRLDAGL 
TNGFGGAFSEQEPGGG1X5RKATPRRRCASESSISSSNSPLCDSS 
FNAPKCGRGKPALVRRHTLEDRSELISC3ENGNYAXARRIAAEV 
GQSSMWI 5TDAAASVLEPLKWV/AKCSGYPS YPALI IDPKMPRV 
PGHHWGVTJPAPPLDVLKIGEKMOTKSDEKLFLVLFFDNl«SWQ 
WLPKSKMVPLGIDETIDKLKWMEGRNSSIRKAVR1AFDRAMNHI* 
SRVHGEPTSDLSDID 


5950 


1166


373


ESRSL7M£TSQPGACPCQGAA£RPAILYALLSSSLKAVPR?R£R 
CLCROHRPVQLCAPHRTCREALDVLAKTVAFLRNLPSFWQLPPQ 
DQRRLLOCCWGPLFLLGLAQDAVTFEVAEAPVPSILKKILLEEP 
SS SGGSGC LPDR PQPSLAAVQWLQCCLES FW SLELS PKE \ YACL 
KG P I h FN F >E VPGIiQAASHI GHLQOEAHWVLCEVLEPWCPAAQGR 
LTRVLI.TAS TLKS I PTSLLGDLPFRP I IGDVDI AGLLGDMLLLR 


5951 


143 


5449 


WNV K P S LL WQLFKFS D KEEHEQNDS I SGKTGETG VEEMI ATRK 
VEODSKETVKLSHEDPHILEDAGSSDISSDAACTNPNKTENSLV 
GLPS C\ r DEVTECNLEX»KDTMGl ADKTENTLERHKI EPI GYCEDA 
ESKR0LESTEFNXSNLEWDTSTFGPESM3LENAICDVPD0NSK 
OLNAi ESTX1 ESMETANLODDRNSOSSSVSYLESKSVKSKHTKP 
VIKSK0W47TDAPKKIVAAKYEV1HSKTKVNVXSVXRNTDVPES 
QQNFHK PUKVRKXQIDKE PKI QSO*SGVK^VKNQAHSVL.KXTLQ 
DOTLVCIFKPbTMSLSDKSHAHPGCLKEPHHPAQTGHVSKSSQK 
OOlKPOOOAPAMKTNSHVKEELEHPGVEHfKEEDKliKLKKPEKN 
LQPRQRRSSKSFSLDEPPLF1FDN1AT3RREGSDHSSSFESKYM 
MTFSKCCGFCKKPHGHRFMVGCGRCDDWFHGDCVGLSLSQAOOW 
GEEDKEYVCVKCCAEEDKKTE1LDPDTLENQATVEFHSGDKTME 
CEXI Gl S KHTTNDRTKY I DDTV KH KVK I LKR ESGEGRNS S DCR D 
NEI KKWQLAPLRKMGQPVjLPRRSSEEKSEKI PKESTTVTCTGEK 
ASK PGTK E KQEMKKKKV \ E KG V LNVHPAAS ASKPS ADQI RQSVR 
fiSLKDILMKRLTDSNLKVPEEKAAKVATKlEKELFSFFRDTDAK 
yjC^KYRSLMFNLKDPKNNlLFKKVLKGEVTPDHLlRMSPEELAS 
KF.LAAWRRRENRHTI EMI EKEQREVERRPI TXITKKGEI EI ESD 
APM KEQEAAME I QEP AAN KSbE KPEGS EK\ R KEEVDSMSKDTTS 
OK R QH LFDLNCK 1 CIGRMAP P VDDLS F K KVKVWGVAM KHS DNE 
AESIADALSSTSNILASEFFEEEKQESPKSTPSPAPRPEMPGTV 
KVES TFLARLN F I ViKG FI NMPS VAKFVT KAY ?VSGS PEYLTED1> 
PDS I QVGGRI SPQTVWDYVEK I KASGTKE1 CA/RFTPVTEEDQI 
SYTLLFAYFSSRKPYGVAANNMKOVKDMYLIPLGATDKIPHPLV 
P FDG PGLELHRPNLLLGLI I RQKLKRQHSACASTSK I AETPESA 
PP j ALPPDKKSK1 EVSTEEAPEEENDFFNSFTTVLWXORNKPOO 
NI.OEDLPTAVEPLMEVTK0EPFKPLRFLPGVL1GWEKOPTTLEL 
ANKPLPVDDI L0SLLGTTGOVYDQ\ AOSVMEQNTVKBI PFLNEO 
TNS K I SKTDNVEVTDGENKE1 KVKVDN J SESTDKSAEI ETS WG 
SSSI SAGSLTSLSbRGKPPDVSTEAFLTNLS 1QSKCEETVESKE 
KTLKROLCED0ENNLQDNOTSNSSPCRSNVGKGNIDGNVSCSEN 
I.VAKTARSPQFINLKKDPR0AAGRSOPVTTSESKDGDSCRNGEK 
UMLPGLSHNKEHLTEQJNVEEKbCSAEKNSCVOQSDNLKVAQNS 
T SVENlOTSQAEOAXPbOEDl LMQNI ETVHPFRRGSAVATSHFE 
VGNTCPSEFPSKSITFTSRSTvSPRTSTNFSPMRPOCPNLQHLKS 
SPPGFPFPGPPNFPPOSMFGFPPHL?PPLLPPPGFG\FA\ONPM 
VPWFPW\HLP\GQPQRMWGPLS0A5RY1GP0NFY0VKDIRRPE 
RRHSDPWGRQDOQObnRPFNRGKGDRORFYSDSHHLKKERHEKE 
WECESERHRRRDRSQDKDRDRKSRKEGHKDKERARLSKGDRGTD 
GKr,SRDSRNVDKK?DKPKSEDYEKl)KEREKSKHREGEKDRDRYil 
KORDHTDRTKSKR 


5952


3226 


C3 9 


PPARRSARCLPRALSKEAARPSGSVmGALCRLL\LVTIi\AFLIF 
AS DA CKNVTLHVPS KLDAE KLVGR VNLKECFTAANLI HSS DPDF 
CI LEDGSVYTTWTILLSSEKRSFTILLSNTENOEKKKIFVFLEH 
OTKVLKKRHTKEKVLRRAKRRWAP I PC5MLENSLGPFPLFLQQV 
OSDTAQtfYTI YYS I RGPGVDOEPRNLFYVERDTGNLYCTRPVPR 
ECYESFEIIAFATTPDGYTPELPLPLIIKIEDENDNYPIFTEET 
YTFT 3 FENCRVGTTVGCVCATDKDEPDTMHTRI.KYS1 1GQVPPS 
PTLFSMHPTTGX'lTTTSSQLDRELIDKYOLKl KVQDMDGQYFGL 
QTTS TCI I NI DDVNDHLPTFTRTS YVTS VEENT VD VEI LRVTVE 
DKDLVNTAKWRANYTILKGNENGNFKIVTDAKTNEGVLCVVKPL 
NYEt XQQMI LOl GVVNEAPFSREASFRSAMSTATVTVNVEDODE 
GPECHPPIQTVRMKENAEVGTTSNGYKAYDPETRSSS5IRYXKL 
TDPTGWVTIDENTGS I KVFRSLDREAETIKNG1 YNITVLASDOG 
GRT CTGTLG 1 1 LQDVNDNS P F I PKKT V 1 1 CKPTMSSAE I VAVDP 
DEFIHGPPFDFSLESSTSEVORMWRLKAINDTAARLSYQKDPPF 
GSYWPIWRDRIX3MSSVTSLDVTLCDCITENDCTHRVDPRIGG 
GGVCIiGKWAILAIIiLGIALFFCILFTLVCGASGTSKQPKVJPDD 
ixAOON^IVSNTEAPGDDKVYSA.>IGFTTGTVGASAOGVCGTVGSG 
1 KNGGQETI EMVKGGHQTSESCRGAGHHHTLDSCRGGHTEVDNC 
R YTYSEWHSFTQPRLGEES I RGHTLI KN 


5953


330 


811 


PLLCNPDPGWYKWVKQESEISKESOEMDARPKLDLGFKEGQTIK 
LCI GNI TNKKGCASKPRTARGGGLSLLPP PPGGKVTI PPPSS /V 
XLPSTNHVTPPSlPKSNHGGSDADILiDLDSPAPVTTPAPTPVS 
V5NDLWGDFSTASSSVPN0AP0PSNWVQF 


5954


32 


2330 


P ?PP PPKLANMADLEAVLADVS Y LMAW E KS KATPAARASKRI VL 
PEPS I RSVMQKYLAERNEITFDKI FNOKIGFLLFKTFCLNEINE 
AVP0VKFYEE1KEYEKLDNEEDRLCRSRQIYDAYIMKELLSCSH 
V T $ KQAVEH VQSHLS KKQV7STLFQPY I EE 3 CESLRCD I FQKFK ; 
fc f DXFTRFCyW KM VK UJ I HLTKNEFSVHR 1 1 GRGGFGE VYGCR K 
^.^TGKMY^KCLNKKPIKMKOGETLAIJJERIMLSLVSTGDCPFJ 
V C KT YAFKT V D KLC F 1 LDLM NGGD LH YH L S QHG V FS E K EMR FY A 
T?.:iLGLEHr»HNRFVVYRPLKFANILLDEHGHARISNDLGLACD 
F f KKKPHA5 VGTHGY^PEVI^KGTAYX>SSADWFSLGCMLFKLL 
RGHSPKRCKKTKDKKF1DRMTLTVNVELFDTFSPELKSLLBGLL 
CKDVSKRLC-CHGGGS0EVKEHSFFKGVDKOHVYLQKYPPPL1PP 
RGEVKAADAFDIGSFDF.EDTKGl KltLDCDOELYKNFPLVI SERVJ 
COEVTETVYEAVNADTDK 1 EAR XRAKNKQLGHEEDYALCXDCl M 
KG V MLK LGK P FLTQVJQRR Y FY L F PNRLE W R GEG E SRGN LLTMEO 
1:,SVEET01KDXKC3LFRIKGGX0FVL0CESDPEFVCKKKELNE 
T > X EAQRLLK RA PKFLNK PR SGT VELP KPS t, CHRNSNGL 


5955


1726 


44< 


KF.EREFRLf.VCPliRYPSAYESSPGTELRECGLCRSGCE'FADCRR 
F^.N-RQDVL^G W I NLF VLQLTKDPLKTPGRLDHGTRTA F 1 HHR EC 
VK KRCINI WK D VGLFG VLN E I ANS EEEVFE WVKTASGWALALCR • 
Wh.S SLHGSLFPH1>SLR£EDLI AEFAQVTNKSSCCLRVFAWHPHT 
NK ? AV ALLr D S VR VYN AS ST I VPSLXHRLQRNVASLAKKPLSAS ' 
VU.YACQSC; LIWTl ^PTfaSTRPSSGCAQVLSHPGHTPVTSLA | 
»k PS GGRLL5 AS PVDAA 1 R VK DVS TETCVF L P WKRGGG V TN IdM j 
SPr'GSKIIJ».TTPSAVFRVV?EA0MViTCERWFTL5GRCOTGCWSPD . 
GI' K I .LFTV1GE PL I YSLSFPERCGEGKG\ALEVOSQORbWQI CL | 
ROC'YRHOMVRRGLGERLTPWSGTPVGNVWLCL 


5956


1705 


13S 


GVGVRGARAh'ATVOEKAAAI.NI^ALHSPAHRPPGFSVAOKPFGA 
TYVWS SI I NTLQTQVEV KKRR HR LKRHNDC FVGS EAVDV I FSHi. 
1 QKKYFGDVD1 PRAKWRVCQALKDYKVFEAVPTKVFGKDKKPT 
Ft : PS S CS L YR FTT I PNCDS QU3 KEN KLY S PAR Y ADAbF KS SD I R 
CAfLEDLW^LSLKPANSPHWISATLSPOVlNEVWOEETIGRU | 
U.bVDbPLbESbLKOOEAVPKIPOPKRQS^VNSSNYLDRGILK j 
AY5nSQEDEWL$AA2DCSEYLPDOMWEISRSFPEOPDRTDLVK j 

EL:,FDAIGRYYSSREPLLNHLSDVHNGIAEL-LVNGKTEIALEAT
OLL LXLLDFQNR EE FR R HH Y FMAVAANPS E FXLOXES DNRMW K

RJFf KAIVDKKNLSXGKTDLLVLFI*\MDHQKDVFXIPGT1j\HX1 
VS \ VX \ LMA] ONGRDPNRDAG Y I Y CQRI DQKDYSIWTEKTTKDE 
LLK'LLXTLDEDSXLSAKEKKKXLLGQFYXCHPDl fi ehfgd 


5957


1479 


4 S3 


ELC VA VAMD71.DR WK PKT KRA KRFLEKREPXLNEN 1 KKAML I K ! 
GGKANATVTKVLKDVYA1,KKPYGVLYKXKMITRPFED07'SLEFF 
SKKSDCSLF» v »FGSHNKKRPNNLVIGRMYDYHVl.nMlELGTENFV 
SLXD1 KNSKCFEGTKPMLI FAGDDFDVTEDYRRLKSLLI DFFRG 
PTVSN J RLAGLEYVLHFTAIJOGKI YFRS YXLLLKXSGCRTPR1 E 
LEE^PSLDLVLRRTHLASDDLYKLSMKMPXALKPKXXKNISHD 
Tr GTTYGR 1 KMQXQDLS XLOTR KM \ XGL K XRPAER I TSDHEKKS 
XR1 KXKLMELSQPLLFHCVLLKRI IKHQSIQSFL 


5958


1 


3l3£ 


AA>.l>GMLI,KTPACOAFA?LCVEKLTVYSGPXGSYFGYAVDFH2PD 
ART/vSVLVGAPKANTSOPDI VEGGAVY YCPWPAEGSAOCRQI PF 
DTI NNRKI RVNGTXEPI EFKSNQWFG\ATVXA\HXGXSCGPVAP 
LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFSPCGNSNADP 
EGCGYCQAGFSLDFYKNGDblVGGPGSFYHQGQVITASVADIIA 
NYSFKDILRKLAGEKOTEVAPASYDDSYIjGYSVAAGEFTGDSOO 
ELVAGI PRGAONFGYVS I INSYDMTFIQNFTGEOMASYFGYTW 
VSDA'NSDGLDDVLVGAPLFMEREFESNPREVGQIYLrLOVSSLI* 
FR DPQI LTGTETFGR FGS AMAHLGDLNQDG YNDI A3 GV P FAGKD 
QRGKVL J YI^NTCJDGLNTXPFPXFCOGVWASHAVPSGFGFTLRGD 1 
SDl DKNDYPDLIVGAFGTGXVAVY?JU?PWrVDAQI>LLKPMI IN j 
1>EK KTCQVPD S MTSAAC FSLR VCAS VTGQS I ANT3 VLMAE VQ1»P 
SLKOKGAIXJITLFLDTWOAHRVFPLVIKRQKSHQCQDFIVY1JID | 
ETEFRDKLSPINISLNYSLDESTFKEGLEVXP1LNYYRENIVSE 
QAH1 LVDCGEDNLCVPDLKLSARPDKHOVIIGDENHLMLI I WAR 
NEGZGAYEAXLFVMIPEEADYVGIERNNXGFRPbSCEYX^ENV-T 
RMWCDLGNPKVSGTNYSbGLRFAVPRLEXTNMSINFDLOIRSS 
NXDKPDSNFVSLQINITAVAOVEIRGVSHPPQIVLPIHNKEPEE 
EPHXEEEVGPLVElUyFLHNIGPSTISDTlLSVGWPFSARJDEFi, 
LYIFHIOTLGPLCCOPKP^INPQDIKPAASPEUTPEbSAFLKNS 
Tl PHLVRXR DVH WE FHR QSPAKI LNCTN 2 ECLQI S CAVGRL EG 
GESAVLKVRSRLWAUTFLORKNDPYALASLVSFEVKKMPYTLOP 
AXLPEGS3A1KTSV3VJATPNVSFSIPLWVIILAILLGLLVLA1L 
TLALW K CG F FD RAJ? P PQ E DMTDR E QL1 ND KT P EA 


5959


1


1166


GTSGYAAOOLPSLLKEREFKLGTLNKVF/iSQHLNHROWCGTKC 
NTLFWDVQTSQ1TK1 PI LKDREPGGVT00GCG1HAIELKPST<T 
LIATGGDNPNSLA1YRLPTLDPVCVGDDGHKDWIFSIAWISDTK 
AV SCS R DGS MG LWE VTDDVLTKS DARHNY S RVPVYAH I TH KALK 
D 1 PKEDTN PDN CKVRA1 JvFNM KN KELGAV SLDGY FKLWKAENTL 
SKI>l,STKLPYCi?Em;CLAYGSEV;S\r/AVGSQAHVSFLDPROPSY 
NVKSVCSRERGSGIKSVSFYEHIITVGTGOGSLLFYDIRAORFL 
EERL5ACYGSKPRLAGENLKLTrG\KGV%LNHDETWRKYFSDlDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGLHGNYAGLKS 


5960 


28S3 


e?o 


FVWSDGGPRPRRGPAVGAGAAHLSDPWAMTPGTANRATNPLNKE | 
LDWAS 3 NGFCEOI >NEDFEGPPLATRLLAP:K I OS PQEVf EA1 QALT 
VLETCMKSCGKR FHDEVGKFR PLNEL I K WS PK YLGSRTS EKVK 
NK1LELLYSWTVGLPEEVKIAEAYOMLKKOG\IVKSDPKLPDDT 
TFPLPPPRPKNVI FEDEEXSKMLARLLKS SllPEDLRAANXL 1 KE 
KVQEDQKRMEK I S KR VNAI EEVNNNVJO. LTEMVWSHSQGGAAAG 
SS EDL\MKEL\YORCEEMKPTLFPTGRVDTEDND\EALAE I LQA 
NDNLTQV2NLY XQLVRG E EVNC-DATAGS 3 PGSTS ALLDLSGL.DL 
PPAGTTYPAMPTRPGEOASPEOPSASVSLLDDELMSLGL.SDPTF 
PSGPSLDGTGWNSFX;SSDATEPPAPALACAPSMESRFPAQTSLF 
A?SG1.DDU)LLGKTLL00SLPPES0QVRKEK00PTPRL.TLRDLC 
N5CSSSCSSP2SSATSLLHTVSPEPPRPPCOPVPTELSLAS1TVP , 
LES I KPSNILPVTVYDOHGFRILFHFARDPLPGRSDVLVWVSM 
LSTAP0P1RN1VFQSAVFXVMKVKL0PPSGTELPAFNPIVHPSA 
ITOVLLLANPOKEKVRLRyKLTFTMGDOTYKEMGDVDQFPPPET 
WGSL 


5961


198 


3147


SGEPRPEPGNiV.TClGEKlEDFKVGNbLGKGSFAGVYRAESIHT 
GLEVAIK/4IDKKAMYXAGMV0RV0NEVK1HC0LKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMMRYLKNRVKFFSENEARHFMHOIIT 
GKLYLHSHGl LKRDLTLSNLLLTRNMH 1K1 ADFCLATQLKMPHE 
KK y TLGGTFNY 1 S P E LATRSAHGLES D VW5. LGCMF YTLL 1GRPP 
FDTDTVKNTLNKVVIADYEMPTFLSJEAKELIHOLLRRNPADRL 
SLSSVLDHPFKSRNSSTKSKDLGTVEDS1ESGHATISTA1TASS 
STS J SGS LFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNS 
FYTOWGNOETSNSGRGRVlonAFERPHSRYLRRAYSSDRSGTSN 
SCSOAKTYTV.E RCH S AEMLS VSKRSGGGF.N EERYS PTDNNAN I F 
NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFPFADPTPQTE 
TVOX)WFGNU?INAHLRKTTEYDSISPNRDFOGHPDLQKI)TSKNA 
WrDTKVKKNEDAEDNAHSVKQONTMKWTAbKSKPEIIQQECVF 
GSDPLSEQSKTRGMSPPKGYQNRTLRSITSPliVAHRLKPlRQKT 
KKAWS I LDSEEVCVELVKEYASOEYVKEVL0I SSDGNTI TI Y Y 
PMGG\RGFPLA\DRPPSPT\DN1SR\YSF\DNLPEKYWRKYQYA 
SRFVQLVRSKSPKITYFTRYAKCILMENSPGADFEVWFYEGVKI 
HKTEDFIOVIEKTGKSYTLKSESEVNSLKEEIKMYMDHANEGHR 
ICLALES I ISEEERKTRSAPFFPI I IGRKPGSTSSPKALSPPPS 
VDSNYPTRDRAS FNRM VMHSAAS PTQAP I LftPSMV 1N^aj1/jL»_ i 
TASGTDISSNSLKDCLPKEAQLLKSVFVKNVGKATQ\LTSGAVW 
V0FNDGSQI.WOAGVSSI SYTSPNG0\TTR\YGENEKLPDY1 KO 
KLQCLSSILLMFSNPTPKFH 


5962 


20 


2447 


RVCSSSASTASOAVTvlADAV'EEIRRLAADFORAQFAEATQRliS ER 
NCI EI VNKLI AQKQLEWHTLDGKEY1 TPAQ1 SKEMRDELKVRG 
GRVNI VDLQO VI NVDL I H 3 ENR I GDI I KSE KHVQLVLGQLI DEN 
Y LDRLAEEVNDK LQESGQVTI SEbCKTYDLPGNFLTOALTOR LG 
R 2 1 SGH I DLDNRGV I PTE AFVARBKAR I RGLFSAI TRPTAVNSL 
I SK yGFOEOLLYS^EELVNSGRLRGTWGGRODKA VFVPDI YS 
RTCSTKVDSFFR0NGYLEFDALSRLGIPDAVSYIKKRYKTTOl»l» 
FLKAACVGQGLVDQVEASVEEA5 SSGTWVD2 APLLPTS bSVEDA 
A J LLQQVKRAF S KQA S T W F S DT VWSE K F \ I NDCT E L K H E L MH 
OKAEKHMKNNPVHLlTEEDLK01STLESVr>TSKKDKKDERRRKA 
TEGS GSMRGGGGGWAI* E YK J KXVXXXGRKPDDSDDESQS SHTGX 
KK PE 1 S FMPQDE I EDFLR ia 1 QDAPEEF 1 S EliAE YL1 K P LN KTY 
LEWRSVFMSSTTSA5GTGRKRT1KDL0EHVSNLYNN1KLFEKG 
MKFFADDTOAALTKJ-fLLKSVCTDlTNIiIFNFlASDLMMAVODPA 
AITSEIRKKJLSKLSEETKVALTKLHNSLNEKSIEDFISCIjDSA ! 
AEACD I WVXRGD K KR FRO I LFOKRQAIiAEOI.KVTEDPAL JLHLT 
SVL1.F0F5THSMLHAFGRCVPQI IAFLNSK 1 PEDOHALLVKYQG 
I •W K OIjV ^n<; If K TROT D Y P L r*N F I>D K 7 OE D V A S TTR K F I .O F I .fi Q 
SI XDLVLKSRXSSVTEE | 


5963


62 


1130 


PKNPQD FPCNRGLMG \QKGEI GPP \GQQG K KG APGMP \ G LMGSN 
GS PGQPGTPGSKGSKGEPG I QGMPGASGLXGEPGATGS PGEPGY 
MGLPGI 0GKKGDKGNOGEKG I QGOKGENGROGI PGQOG 10GHHG 
AKGERGEKGEPGVRGA1GS KGESGVDGLMG PAGPKGQPGDPGPQ 
GP PG1 iDGKPGRE FSEOF I R OVCTDV I RAQLP VbLOSGR I RUCDH 
CLSQHGSPGI PGPPGP3GPEGPRGLPGLPGKDGVPGLVGVPGJIP 
GVRGLXGLPGRNGEKGSOGFGYPGEQGPPGPPGPSGPPG1SKEG 
PPGDPGLPGXDGDHGXPG1 OGQPGPPGI CT PSLCFSV J ARRDPF 
RXGPNY 


5964 


3 


2147 


SCRTRGRLSPLOPR^ACSSRGSRARSEPPKFGGMEEACQVOTTK 1 
RGDPHELRNIFLOYASTEVDGERYMTPEDFVORYLGLYKUPNSN 
P K I VQ LLAGV ADQT KDG L 1 S Y QE FLAFES V 1 . C A PDS M F I VA FQL 
FDKSGNGEVTFENVXE2FGOTIIHHHIPFNKDCEFIRLHFGHNR 
KKHLNYTEFTOFLOELOLEHARQAFALKDKSKSGMISGLDFSDI 
MVTIRS HMLTPFVE ENLVS AAGGS I S HQVS F£ YFNAFNS LLNNM 

r*i itni/i vfTi hi^tti vrrrtrrvprrjiA^movr^hTnt nisti vr/~\ 
ELV.N.KJ rSTLAGTR.KEA£,V J XXEFAQSAI H l G£>A3 l*YO 

LADLYNASGRLTLADI ERIAPLAEGALPYNU--EL0ROOSPG1.gr 

PIWLCIAESAYRFTLGSVAGAVGATAVYP1LLVKTRM0NORGSG 

SWGELMYKNSFDCFKKVLRYEGFFGLYRGLIPOLIGVAPEKAI 

XiTVKDFVRDKFTRRDGSVPLPAEVLAGGCAGGSOVIFTNPLEI 

VK1RLQVAGEITTGPRVSAUJVLRDLGIFGLYKGAXACFLRDIP 

FS A I Y F P VY AHCX LLLADENGK VGGLNLLAAG AMAG \ VPAAS LV 

TPADV1KTRLQVAARAGQTTY£GVIDCFRXIL\REEGPSAFWKG 

TAARVFRS5PQFG\VTLVTYELLQRGFYIDFGGLXPAGSEPTPK 

SR1ADLPPANPDH1GGYRLATATFAGIENXFGLYLPXFX5PSVA 

W0PKAAVAATO 


5965


1 


149B 


KVT^YRFLPTSNWAAXLRSLLPPDLRLOFWLHJVRLQKCFl.SRG 
CGS Y CAG AKAS P LPG K W AMGLMCGRR ELLR LL0SGR R VH S V AG P 
S0WLGKPL.TTRLLFPAA PCCCR PHYLFLAASGPRSLSTSAI S FA 
IP/OVOAPPWAATPSPTAVPEVASGETADWOTAAEOSFAELGL 
GS YTFVGLI QNLLEFMHVDLGLPWWGA1 AACTVFARCL1 FPL1 V 
TGOREAAKlllNHLPElOK^SSRIREAKliAGDHlEYYXASSENiAL 
YQXKHG1 KLYKPL1LPVTOAP J FI SFFlALRENANLPVPSbOTG 
GLW WFODLTVSDP I Y I L PLA VTATMWAVIiELG AETGV0SS DLQW 
MRNVIRMMPLI TLP J TMH FPTAVFM YWLSSNLFSLV0VSCLR3 P 
AVRTVLKIPQRWHDLDXLPPREGFLESFKKGWKNAEMTROLRE 
REORMRKQLELAAPGPLROTFTHNPLLQPGKDNPPNIPSS\SSS 
SSKPKSKYPWHDTLG j 


5966 


202


1925 


RSKQ VMARLTKRRQADT KAI QHLWAAIE 1 1 RNO KC3 AN I DR I TK 
YMSRVHGMHPXETTROLSLAVKDGLI VETLTVGCKGS KAG I EQK 
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL 
£ DE FRLRDS SSPWQCPVCRS 1 KKKNTNKQEMGTYLRF1 VS RMKJE 
RA I DLMKKGKDNKHPMYP.RLVKS AVDVPT1QE KVNEGKYPJS YEE 
FXADAQLLLHNTVI FYGADSEOADI ARMLYKCTCHEL\ DELQLC 
XNCF YLAKARPDNWFCY PC I FNHELDWAKMKGFGFWP AKVMQKE 
DN0VDWFFGHHH0RAMIPSEN10DITVN3HRLHVKRSMGWKKA 
CDELELHORFLREGRFWXSKNEDRGBEEAES SI SSTSNEObKVT 
0EPRAXKGRRN0SVEPXKEEPEPETEAVSSS0E1PTMPOPIEKV 
SVSTQTKXLSASS PRMLKRST0TTNDGVCQSMCHDKYTX1 FNDF 
KDRMKSDHKR ETERWR £ ALE K LRSEKE E EKRQAVN KA VANMCG 
EMDRKCKOVKHKCKEFr VEE Z KXLATQHKGU SCTXKKQWCYNC 
EEEAKYHCCKNTSYCS1 XCCCEHWHAEHKRTCRRXR 


5967 




1925 
1^88 


RS KOVMAKLTKRROADTKA 1 QKLWAAI E 1 1 RNCKQI AN I DR1 TK 
YMSRVHGMHPKETTROLSl^.VKDGLIVETLTVGCKGSKAGIEQE 
GYWLPGPE3DWETENHDWYCI-ECHLPGEVLICDLCFRVYHSKCL 
SDEFRLKDSSSPWQCPVCRS 1 KKKNTNKQEMGTYLRFI VSRMKE 
RAIDuWKGKDNKHPMYRRLVKSAVDVPTIOEtCVNEGKYRSYEE 
FKADAQLLLHNTV1FYGADSE0ADIARMLYKDTCHEL\DEL0LC 
KNCFYLANAR PDNWFC YPC1 F KHELDWAKMKGFG FWP AKVMQKE 
DNQVD7RFFGHHH0RAVJ 1 PSEK 1 QD1TVM 1 HRLHVXR SMGW KKA 
CDELELH0RFLREGRFWXSKNEDRGEEEAESS1SSTENE0LKVT 
QEPRAKKGFRNQSVEPKKFEFEPETEAVSSSQEIPTMPQPIEKV 
SVST0TKKLSASSPRMLHR5TCTTNEX?VC0SMC}IDKYTKIFNDF 
KDRMKSDHKR ETERWR EAIE KLRS EMEE E KRQA VNKA VANMQG 
EMDRKCX0VKEKCXEEFVEE1KKXAT0HK0LISQTKKK0WCYNC 
EE E AM YHCCKTfTS YCS I KCQOEi I WHA EH KRTCR R KR 


5968


61 


VRFPRRGGAPPTVLTPGROOGVFU5PQRPGSEPDIPARGQPHPP 
R PVG V STR AC AQVQP P AKHR R R LALGLG FCLLAC T C LG V LW V Y L 
ENWLPVSYVPYYI,PCPE1FIJKiK],HYKREKPL0PWWSCYP0PKL 
LEHR PTQLLT LT P W LA P I V S E G7 FNP ELLQH 1 Y Q ? LNLT 1 G VTV 
FAVG N / H FLFS AE E FFKRG Y RVftYYJ FTDN PAAVPGVPLGPHR L 
LSSIPlQGRSI-WEETSMRKKrTISOHIAKRAKREVDYLFCLDVD 
MVFRNPWGPETLGDLVAA3KP5YYAVPROOFPYERRRVSTAr/A 
DS EGDFY YGG AVFGGQ VAR V Y E FTRGCHMA I LAD KANG I F4AAWR 
ESSHLNRHFISNKPSKVLSPEYLWDDRKPOPPSLKLIRFSTLDK 
DISCLRS 


5969 


112( 


533 


DVGFKI KRKRCDLDVF LESPRKPSGRRDRAPEKQRRIAANKCLC 
TGVREGEPPS/TTS0KVKEAGRDFTYLIWLFG1SITGGLFYT1 
FXSLFSSSSPSK1YGRALEKCRSHPEVIGVFGESVKGYGEVTRR 
GRROHVR FTE YV KDGLKHTCV K F YI EGS E PGKOGT V Y AOVKEN P 
GSGEYDFRYI FVE3ESYPRRTI 1 1 EDNRSQDD 


5970 


316 


4712 


SODNIGHRLLC'KHGWKLGOGLGKSLQGRTDPIPI WKYDVMGMG 
RMEMb'LDYAEDATERRRVLEVEKEDTEELROKYKDYVDKEKAIA 
KALEDLRANFYCELCDKOYCKHQEFDNHINSYDHAHKORbKDLK 
OR E ?AR>"VS S R SRKDE X KOE KAi R R LHELAEQRXC AE CAPGSGP 
MFKPTTVAVT)EEGGEDDKDESATNSGTGATASOGIiGSEFSTDKG 
G P FTAVO I TN TTGLAOAPGLASQG I SFG 1 KNNl £TPLQ K LGVS F 
SFAKXAFVKLESIASVFKDKAEEGTSEDGTXPDEKSSDOGLCKV 
GDSDGSSNLDGKKEDEDPODGGSLASTLSXi>KRMKREEGAGATE 
FEY YHYIPPA.MCKVKPNFPFLLFMRASEOMDGDNTTHPKN APES 
KKGSSPXPXSCIKAAASOGAEKTVSEVSEOPKETSMTEPSEPGS 
KAEAKKALGGDVSDOS LESHS OKVS ETOMCESNSSKETS LATPA 
GKESOEGFKKPTGPFFPVI.SKDESTALQWPSELLIFTKAEPSIS 
YSCNPLYFDFXLSRNKDARTKGTEKPXDIGSSSKDHL0GLDP6E 
PNKSKEVGCEKIVRSSGGRMDAPASGSACSGLNXQEPGGSHGSE 
TEDTGRSLPSXKERSGKSHRHKKKKKHKKSSKHKRKHKADrEEK 
SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKFEPPGSGSPA 
PPRRR RRAQDDSQRRSLPAEEGS SGKKDEGGGG S E SQDK GG R KH 
KGELPPSSCQRRAGTKRSSRSSMRSQPSSGDEDSDDASSHRLHQ 
KSPSOYSEEEEEEDSGSEHSRSRGRSGRRKSSHRSSRRSYSSSS 

DASSDOSCYSRORSYSDDSYSDYSDRSRRKSKRSHDSDDSDYAS
SKHRSKRHKYSSSDDDYSLSCSCSRSRSRSKTRERSRSRGRSRS
SSCSRSRSKRRSRSTTAHSWORSRSYSRDRSRSTRSPSORSGSR
XRSWGHESPEERHSGRRDFI RSK I YRSQSPHYFRSGRGEGPGKK
DDSRG DDS KATG PPSQNSN IGTGRGSEGDCSPEDKNSVTA KLLL
EKIQS RKVERKPSVS EEVQATF NKAGPKLKDPPQGY FGPKLP PS
LGNKPVLP LIG K LP ATRKPN K KCE ESGL3RGEBQEQ S ETEEG P ?

GSSDALFGHOFP\SEETTGPLLDPPPEESKSGEVTADHPVAPLG 
PPAHFDCYLGDPTISHNYLPDPSDGNTLESLDSSSOPGFVESSL 
LPIAPDLEHFPSYAPPSGDPSI ESTDGAEDA\SLAPLESQP3TP 
TrEEMEKY SKL00AAQQK J QQQLLAKQVKAF PASAALAPATPAb " 
0 P I H 3 00PATASA7S I TTVQHAI LQlttlAAAAAAA IG.THPHPHPQ 
PXxAOVKHIFOPHLTPISLSHLTHSIIPRHPATFLASHPIHIIPA 
SPJHPGPFTFHPVPHAALYPTLLAPRPAAAAATALKLHPLLHPI 
FSG3DLQHPPSHGT 


5971 


53 


2149 


SFLYFVG VDMDNPI GNWDGRFDGVQLCSFACVESTI L»Z>H I KDI I 
PESVTC'ERRPPKLAFMSRGVGDKGSSSENKPKATGSTSDPGNRN 
RSP.LFYTLUGSSVDSOPOSKSKNTWYlDEVAEDFAXSbTElSTD 
FDR3SPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGSIGHSPL 
SLSAOSVMEELNTAPVQESPPLAMPPGNSHGLEVGSLAEVKENP 
PFYGV IRW1 GOP PGLNEVUVGLELEDEC AG \ CTDGTF /R EGTR Y 
FTCALKKALFVKLXSCRPDSRFASLQPVSNQ1ERCNSLA3WEAY 

LSE^/VEENTPTQKWEKEGLEIMIGXKKXGIQGHYNSCYLDSTLF
CLFAFSSVLDTVLLRPKEKKDVEYYSET0ELLRTEIVNPLR1YG
VVCATK3MKLRK3LEKVEAASGFTSEEKPPEEFLNILFHH1LRV

EPLLXXRSAGQKVODCYFYQIFWEKNEKVGVPTIOOLLEWSFIN 
SNL K FAEAP S CLI I QM P R FGKD F K LFK K 3 F PS LELN I TDLLE DT 
PRQCR1 CGGLAMYECRECYDDPD J SAGK1 KQFCKTCNTQVHLHP 
KRIjNH KYNP VS J ^PKDLPDWDWRHGCI PC0NME LFAVLC I ETSHY 
VAFVKYGKDDSAWLFFDSMADRDGGQNGFNIPQVTPCPEVGEYL 
KMSLEDLHSLDSRRIQGCARRLLCDAIYVPCTOSPTMSliYK 


5972 


440 


1761 


Ibl^GSPSPRDOCSOROSSGGDKELVTRGCTFSTAVVSPSAriTO 
EPFREEI^YDRMPTLERGRQDPASYAPPAKPSDLQLSKRLPPCF 
SHKTWVFSVXjMGSCLLVTSGFSLYLiGMVFPAEMDYLRCAAGSCJ 
PSAIVSFTVSRRNANVIPNFQ1LFVSTFAVTTTCL1VCFGCKLVI. 
NPS A IN I N FNL3 LLLI ,LE LLMAAT V 1 1 AAR SSEEDCKK KKG SMS 
U?AN J LDEVPFPARVLKSYSWEVI AG2SAVLGGI 1ALNVDDSV 
SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVBVLIAISSL 
TSPI>LFTASGYt>SFSIMRIVEMFKDY?PAl KPSYDVLLLbLLLV 
LLLOA / GPQKGHRH PVRAJjQGQCKAAGCI LGH P ER PAGAPG WGG 
GOEPPEGVROGESLESRRGANGPVTPRRGKRVAAPSLAPGMETH 

m 


5973 


6? 


2007


MGDGKDLFGH 3 WAWRSNGI I SNFRRSPHAGMAEDEPDAKSPKTG 
GRAPPGGAEAGEPTTLL0RLRGT1 SKAVONKVEGI LQDVQKFSD 
NDKLYLYLOLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHIiBE 
HTDTCLPKQSVYDAYRKYCESLACCRPIiSTANFGKIIREIFPDI 
KARR LGG RG0£ K YC YSG J R R KTLVS MP ? LPGLD LKGSES P E MG P 
E VTPAPR DELVEAACALTCDWAER 3 LKRSFSS 3 VEVARFLLQQH 
L I SARS AHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH 
K KPERLAOPP KDLEARTGAG PLARG ERKKS WES S APGANNLQV 
NALVARLPLLLPRAPRSLI PPI PVSPPILAPRLSSGALKVATLP 
LSSRAGAPPAAVPIINMILPTVPALPGPGPGPGRAPPGGLTQPR 
GTEMREVG IGGDQGPHDKGVKRTAEVPVS EASGCAPPAXAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRU 
PWErWGSGGEG^SAGGAERPGPMGEAEKGAVl J AOG\CGDG , TVSX 
GGR3PGS QHTKEAEDK IPL VPS KVS V J KGS RSQKEA FPLA KGEV 
DTAPOGNKCLKTHVLQSSLSQEHKDPKATPP 


5974


4293 


2200 


LG I iQMHTTSGR IKQAMVTSLNEDNESVTVEWI ENGDTKGK\EID 
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV \AS I KNDPP S \RDNRWGSARARPSQFPEOFSS AQQNGS V\S 
DI S? VQAAKXE FGPPSRRKSNCVKEVEKLQEKREKRRLQOOELR 

VCVXKRPLNKK£TQMKPLDV3TI PSKDWMVHEPKQKVpLTRYI* 
ENOTFKFD YAFDDS APNEKV Y R FTAR PLVET I FERGMATCFA YG 
DTGSGXTHTMGGDFSGXN^CSXGIYAIAARPVFLMLKKPNYKK 
LELQV YATFFE I YSGKVFDLLNRKTKU? VLFPGKQQVQVVGLQE 
REVKCVEDVLK1.IDIGNSCRTSG0TSANAKSSRSHAVF0IILRR 
KGKLHGKFSLI DIAGHERGADTS SADRQTRLEGAEINKSLbALK 
ECI RALGRXKPHTPFRASKLTQVLRDSF3 GEflSRTCMI ATI SPG 
MASCENT^^TLRYANRVKSLTVDPTAAGDVRPI^IKHPPNOI\DD 
LETQWGVGSSPORDDLKLlXEQNEEEVSPOIiFTFHEAVSQMVEM 
EEQWEDHRAVFQES I RWLEDEKALLtKTEEVDyD^-DSyATOLE - 
AI DECK 1 D I LTE LK DKVKS FRA.\LC E E EOAS K0 1 N P KR PRAL 


5975


4293


2200


LGL.0KHTTSGR1 HOAMVTSLNEDNEE VTVEW1ENGDTKGK\ El D 
LESIFSLNP\DL\VPD3EIEPSP\ETPPPPASSAKV74KIVKNRR 
7V\ASI KNDPPS \RDNRWGSARARPS0FPE0FSSAOQNGSV\S 
DlSPVQAAKKEFGPPSRRKSNCVKEVEKXQEKREKRRljQQQELR 
EKRACDVDATNFNYEI MCMJRDFRGSLDYR PLTTADP I DEHR I C 

EHQTFR FDYAFDDSAPNEMVYR FTAR PLVETI FERGMATCFAYG 
QTG 5 G KTH TMGG D FS G KNCDCS KG I Y A L AA R DV FLML X K.PN Y X X 
I,EL»QVYATFFEiySGKVFr»M.NRKTKLRVLEDGKQOVO\A/GI,0E 
REVKCVEDVL.KLI DIGNSCRTSGQTS A^AHSSRSKAVFQI ILRR 
KGKLHGKFSLIDLAGNERGADTSSADR0TRLEGAE1NKSLLALK 
ECI RALGRNKPHTPFRASKLTQVLRDS F1GENSRTCM I ATI S PG 
MAS CENTLWTLRyANRVKELTVDPTAAGDVRPI MHHPPNQI \DD 
LErOWSVGSSPORDDbKLLCEONEEEVSPOLFTFHEAVSOMVEM 
EEQWEDHRAVFOESIRWLEDEKALLEMTEEVDyDVDSYAlXJLE 
AI 1>EQK I D : LTELR DKVKSFRAALOEE EQAS KO i N P KR PRAL 


5976


2D 


291S 


VHH1,HLTRVSW^LDIILR1AWMG1KTLNLVLG\LKRA\LEF 
PEVSWMEVKDPMMKGA/1LTNTGKYAIPTIDA\EAYA1GKKEKPP 
FLPEEPSSSSEEDDPJPDELLCLICKnaMTDAWIPCCGNSYCD 
ECI RTALLESDEHTCPTCHQNDVS PDA.L1 ANKFLRQAVNNFKNE 
TGYTKRI.RKQLPSPPPPIPPPRPLIORNLOPLHRSPISROODPL 
MIPVTSSSTHPAPSJSSbTSNQSSIAPPVSGNPSSAPAPVPDJT 
AT VS I S VKS E XS DC P FRDSDNK I LPAAAIASEHS KGTS S I AI TA 
LMEEKGYOVPVLGTPSLLGOSLLHGOL.IPTTGPVRINTARPGGG 
RPGWEHSNKI.GYI,VSPPQQIRRGERSCYRS1MRGRHHSERSQRT 
CGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSV?PPGFPPAPANLSTPWVSSGVOTAHSNTIPTTQ 
APPLSREEFYREQRKLXEEEKKXSXLDEFTNDFAXELMEYXKIQ 
XERRRS FSRS KS PY£>GSS roKo £> x 1 rSK5l<£>Oc> IKoRSiSKSFS 
RSHSRS YSRSPPYPRRGRGKSPJNYRSRSRSHGYHRSRSRSPPYR 
RYHSRSRSPQAFRGQSPNKRNVPOGETEREYFNRYREVPPPYDM 
KAYYGRSVDFRDPFEKERYREWERKYKEWYEKYYXGYAAGAQPR 
FSAKRENF55PERFI.PLNIRNSPFTRGRREDYVGGOSHRSRNIGS 
NYPEKLSARDGHWOKDNTKSK^KESEWAPGDGKGNKHKKKRKRR 
KGEESEGFLNPELLETSRKSREPTGVEENXTDSLFVLPSRDDAT 
PVRDEPMDAESITFKSVSEKDKRERDXPKAKGDKTKRKKPGSAV 
SXXENIVXPAXGPQEKVDG\DVRDLLDLN1i\0IjKKPKEETPKDL 
TILNHHLPLRRMKKSL\EPP\EKLTLNOOX\TPRNKTSQRGKSE 
EGLPORCQIRKANN 


5977 


1363 


133 6 


FLEDRGQVLSKFOCLSLHSINH1LHPGAGVAAGPATGW/REYLT 
PVLKES KFKETGV 1 TPEEFVAAGDHLVHHCPTWQWATGEELKVK 
A YLPTGKQFLVTKJJ VPCYKRCKQMEYSDELEAI I EEDDGDGG WV 
DTYHNTG I TGI TEAVKE I TLENKDNI R LQDCSAbCE EE EDEDEG 
E AADMEE YEESGLLETDEATLDTRKI VEACKAXTDAGG EDAILQ 
TRrYDLYlTYDXYYOTPRLWliFGYDEOROPLTVEHMYEDISQDH 
VXKTVTI ENHPHLPPPPMCSVHPCRHAEVUKXI I ETVAEGGGEL 
G VHM YLL I FLKFVQAVI PTI E YDYTR H FTM 


5978 


160 


3213


RDG APR WGGCCS PLTWAPG F Y R R FDLATSGRRLRGQTAEPAGRQ 
RPRREPEAHDEQSVES1AEVFRCFICMEKLRDARLCPHC3KLCC 
FSCIRRWLTEORAQCPHCRAPLQLRELVNCRWAEEVTQQLDTLQ 
Z>CSXiTKHEEWEXDXCEN^EKl,SVFCWTCKKCIC7i0CALWGGMK 
GGHTFKPLAE 1 YEOH VTKVNE EVAKLR R RLMELI SLVOSVERNV 
EAVRNAKDERVRE 1RNAVEMM I ARLDTQLXNKLITLMGQKTSLT 
CETELLESbLQE V EHOLRSCS XS ELI S KSS E I LMMFQQVHRKPM 
AS FVTTPVP PDFTSELVPS YDS ATFVLEN FSTLRORADP VYSPP 
LQVSGLCWRLKVY PDGNGVVRGYY LS VFbELSAGLPETS KYEYR 
VEMVHOS CND PTKNI I R EFASPFEVGECWG YNR F FRLDUANBG 
YLNPONDTVI LRFQ VRS PTFFQKSRDOHH Y I TQLEAAQTS YIOO 
I NNLKERLTI ELSKTQKSRDLSPPDNHbS PONDDALETRAKKSA 
CSDMLLERXGPYSAS \VREAKEDEEDEEX1 QNEDYHKELSDGDIj 
DLDLVYE DE VKQL.DG S S G DAS STAT SNTEEND1 DEETMSG EN DV 
EYrJNMELEEGELMEDAAAAGPAGSSHGYVGSSSRISRRTKLCSA 
A/TSSLLDIDPLIljIKLl^LKDRSSlENLKGLQPRPPASLbQPTA 
S YS R KDK DQR KQQAMWR V PS DLKMLK.RLKTQMASVR CMKTDVKN 
TLS E I KSSSAASGDMQ7S1 >FSADOAALAACGTENSGRLQDLGME 
LIAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 
NSRSKGDCOTLSEGSPGSSOSGSRHSSPRALIHGSIGDILPKTE 
DROCKALDSDAVWAVFSGLPAVEKRRKMVTLGANAKGGHLEGL 
0MTDLEMNSETGELQPVLPEGASAAPSEGMSSDSD1ECDTEXEE 
CEEHTSVG6FHDSFMVM70FPDED7HSSFPDGE0IGPEDLSFNT 
EENSGR 


5979


232 


3665 


lpdmtk y lwl kllafg fafldtevfvtgqs ptpsptdaylnase 
tttlspsgsavistttiattpskptcdekyanitvdylynkstk 
lft a klnvn envecjgnntctnn ev knltec kn as v sis h nscta 
pdktl1 ldvppgvekvpvhccsxoveqpdstiwlkwknietstc 
dtqnitykfocgnmifdkkeiklenlepeheykcdseilynshk 
ftnaskiiktdfgspgepqiifcrseaahcgvitwnpfqrsfhn 
ftlcyi ketekdclnldknli kydlonlkfytkyvlslha y i t a 
kvqrngsaamchfttksappsqvwnm'pvsmtsdnsmhvkcrppr 
drngphehyhleveagntu'rneshkncdfrvkdlovstdytfk 
ayfhngdypgeffilhkstsynskaliafi7vpliivts1allw 
lyk1 y dlhkkr scnldeooelverddexqlmnvepi kadi llet 
ykrkiadegrlfuaefos:prvfskfpikearkpfnonknryvd 
i lp ydynrvelse i ngdagsn yinas y i dgfkeprky i aaqgpr 
detvddrwrmj weqkatv i vmvtrce egivrnkcaeywpsmeegt 
rafg eccckdltkh kr c p \dy 1 1 qkln i vnk kekatgre vth 1 0 
ftswpdhgvpedphlllklrrtrvnafsnffsgpi whcsagvgr 

ivrvTrrnhMr rvT Oi\ irw v\rr\\TVf > 'V\f\7VX DDr.On JH\/0\7PZ»AV' r 
ToTY-IGI DAMIj£GIjr.Ac,IVjv VUv ZU i v VRijKHvKi»wnvuvf./*v * ■•■ 

L3 HQALVE YNOFGETEVNLS SLHPYLUNMKKRDP PS EPS PLE AE 

FQRLPSYRSWRTQHI G*C/E\ ENKSKNRNSNV3 PYDYNRVFLKHE 

LEMSKESEHDSDESSDDDSDSEEPSKY1NASFIMSYWKP\EVM3 

A.AQGPLKET1GDFWOMIF0RKVKV1WLTELKHGDOEICAQYMG 

EGKQTYG^lEV^LKDTTiKSSTYTLRVFELRKSKRKDSRTVYQYO 

YTNWSVEOLPAEPKELISMI0WK0KLP0KNSSEGNKKHKSTPI. 

L1HCRDGSQQTGI KCALLNLLESAETEEVVDI FQWKALRKARP 

GMVSTFEQYQFLYDV1ASTYPA0NGQVKKNMHOEDKIEFDNEVD 

KVKQDAN C VN P LGAPEK LP FA KEQAECS EPTSGTEG PEHSVNC P 

ASPALNQGS 


5980


3 


2363 


DAWGCKLRjRLRFTYGTOTKVSLALPGQYELVHTLVAHOGNWETI 
PEEDLEVQ2NNEDAAHDL7ELEVTK}IKALLQEVDWVAPC0GI.R 
P TVDVLGDLVNPFLPV I TYALHXDELS ERDEQELOE IRKYFSFP 
V FFFKVP KLGSE 1 1 DS ST R RMESER S P L Y R QLI DLGY LS SSHWN 
CGAPGQDT:<AO£MbVEQSEKLRHLSTFSHOVLQTRI,VI)AAKALN 
LVHCHCLDIFINQAFD^RDLQITPKRLEYTRKKENELYESL^ 
2 AKRKQEEMKDM1 VET LNTMK EELLE D ATNME FKD V I V P ENGE P 
VGTREIKCCIRQ1QELI 3 SRLNQAVANKLI SS VDYLRESFVGTL 
ER CLQSLEKSQDVS VH I TSNY LKQI LNAAYH VEVTFHSGSSVTR 
MLWEQI KOI IQR ITWVS PFAI TLEKKRKVA0EAIE5LSASKLAX 
SICSOFRTRLNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 
DHAPRLAR L»S LESRS LQDVLLHRKP K LG Q E LG3GQ YG VVY LCDN 
WGGHFPCAI.KSVVPPD5KHWNDLALEFKYMRSLPKHERLVDLHG 
SVIDYNYGGGSS1AVLLIMFRLHRDLYTGLKAGLTLETRLQ1AL 
DWEG1 RFLHSQGLVHPJD J KLKNVLLDKQNRAX1TDLOFCKPEA 
MMSGS3VGTPIHKAPELFTGKYDNSVDVYAFGILFWY3CSGSVK 
LPEAFERCAS KDHLWNNVRRG AR PERLP VFDEECWQLME ACWCG 
DPLKRPLLG I VQPMIiQG I MI^RLCKS\KS EQ PNRGLDDST 


5981


1 


2515 


GRkKSAJtMEKPVK3AADGLSRWPHGLGLLLLLQLLPPSTLSQDRL 
DAPPPPAAPLPRWSGPIGVSWGLRAAAA\GGA?PRGGRWRRSAP 
G \EDEECGR VRDFVAKLANNTHQHVFDDLRG S VSLSWVGDS TXJV 
ILVLTTPKVPLVJKTFGOSKLyRSEDYGKNFKDITDLINNTFIR 
TEPGHAIGPENSGKVVLTAEVSGGSRGGRI FRSSDFAKKFVQTD ~~ 
LP FHPLTQMMYS PON? D Y LLALSTENGLWVSKN FGGK»EE IHKA 
VCLAK WGS DNT I FFTT YANG S CXADLG ALE LWRTS DLG KS F KT I 
GVKIYSFGI^RFLFASVMADKDTTRRJHVSTDCJGDTWSKAOLP 
S VG OEOFYS I LAAN DD W FMHVDEPGDTGFGTI FTS DDRG I VYS 

DOGC-R WTHI.RK PEN SECDATAKN K3>IECSLH I HASYS I SOKLNV P 
MAP LS E PNAVG 1 VI AHGS VGDAI S VM VPDVY ISDDGG YS WTKML 
EGP1JY YTI IOSGGI I VAI EHSSRPJ NVI KFSTDEGQCWQTYTFT 
RDP 3 Y FTGLASEPGAR SMN1 S I WG FTES FLTSQWVS YT I DFKD I 
LERNCE EKDYTIWUAKSTCPEDYEDGC 1LGYKEQFLR LR KSS VC 
QNG3DYVVTKQPSICLCSLEDPLCDFGYYRPENDSKCVEQPELK 
GHDbEFCLYGREEHLTTNGYRKIPGDKCOGGVNPVREVKDLKKK 
CTSNFLSPEKQNSKSNS VP1 1 LAI VGLMLVTWAGVLI VXKYVC 
GGR FLVKLY S VX,QOH\AEA\NGVDGVDALDTASHTN KSGYHDDS 
DEDLLE 


5982 




23 It 


ATRPPRGSSWCRQFSRTASAAPGRSNMLRIPVRKALVGLSKSPK 
GCVRTTATAASNLI EVFVDGOSVMVEPGTTVLOACEKVGMOI PR 
FC YHERLSVAGNCRMCLVE1 EKAP KVVAACAMPVMKGWNI LTKS 
E KS K KAREG VME FL1 »ANH PLDCP I CDQGGECDLODQSMMFGNDR 
£ R FI .FGXRAVEDKN 1 GP LVKTI MTRCI QCTRC I RFAS E J AG VDD 
1/GTTGRGNDMQVGTY 1 E KMFMSELSGNI 1 DI CPVGALTSKPYAF 
TAR P W E TR KTES I D VMD A VG SN I V VS TRTG E VMR I L PR MH ED I N 
FEW I S DKTR FAYDG LKR OR LTE PM VR ME KGLLT YTS W EDALS R V 
AGMLQS FOG XDVAA 1 AGGLVDAE ALVALKDLLNR VDS DTLCTE E 
VFPTAGAGTDLRSNYLLNrriAGVEEADWLLVGTNPRFEAPLF 
NAR I R K S WLHNDLKVAL I GS PVDLT YT Y DH LGDSPK1 LQD I ASG 
SHPFSQVLKEAXKPKWLGSSALQRNDGAAILAAVSS3A0KIRM 
T5GVTGDWKVMNILHR1AS0VAALDLGYKPGVEAIRKNPPKVX.F 
LLGADGGC1TRQDLPKDC? IIYCGKHGDVGAPIADVILPGAAYT 
EKSAT/VNTEGRAOQTKVAVTPPGl^AREDKKI I RALSEIAGMTL 
PYDTiA DQVRNR LEE VS PNL VR YDD 1 EG\AN YFQQANELS KLVN 
QCLLADPLV PP0LTMKDFYMTDS1 SRASC/TMAKCVKAVTEGAQA 
VEEPSIC 


5983


248 


1761


EARG DGG R R RHRA SG R RAG R G E P \ h G L K SQG QRAV PKRAVAR GG 

DCNRAL3LHPFSMKPLl,RRAKAYETLEQYGKAYVDYKTVLOIDC 
GLQLANDS VNRLS RI LMELDGPNWREKLSL1 PAVPASVPLQAWH 
PAKEKI S KQAGDSSSHnOOGI TDEKTFXALKEEGNQCVNDKNYK 
DA r, K v S ECL K T IVTMFCF C A I YT *IRA T »r v I . K r jCO FE EAKODCDOAL 
QLADGHVmFYRJtALAFKGLKNYOKSLIDLNKVlLLDPSl I EAK 
MELEEVTRLLNLKDKTAPFNKSKEKRKIEIQEVNBGKEEPGRPA 
G EVS TGCLASEKGGKS S R S PEDPEKLP I AKPNNAYEFGOI 1 NAL 
S TRKDKEACAHLLA1 TA PKDLPMFLSNKLEGDTFLLLI OSLKNN 
LI EKCPS LVYQHLLY LSKAER FKMMLTLI S KGQKELI EOLFEDL 
SDTPNNH FTLED I QALKRQYEL 


5984 


755 


1193 


SSVCMACTYVSN1/5KK0RSVSFLASGLMRVSTGPELRLHHSFVL 
TGDVGRR I CRLLVGLFTKGDTSSKR VHPFS PGPCFLLCULAR VG 
SSPK1^SPFY0N\QTST0RSCTVFVWQRCSLVGPFQVTVFTMY 
FHHSLRS ISRFSSG 


5985 


22 


140fr 


RRVARPGTAEPAKARRTVRRGRARRDLAGAERKAGVSERGDSGR 
RRPNPSIPSAAAGMSHaQlPPGLTFXLC^YTVEVLROQPPDLVE 
FAVEYFTRLREARAPASVLPAATPROSLGHPPPEPGPDRVADAK 
GDS ES EEDEDLE V PVPSR FWR RVS VCAET YNPDEEEEDTDPR V 1 
HPKTDEOKCRLOEACKDI LLFKNLDOEQLSQVLDAMFER I VKAD 
EHVI D0GPDGDNFYV1 ERGTYDI LVTKDNQTRSVGQYDNRGSFG 
El^LMYNTPRAATIVATSEGSLWGLDRVTFRRIIVKNNAXKRKW 
FESFI ES VPLLKS LE VSERMK I VDVI GBKI YKR/DGER 1 1 TOG E 
K \ ADS FY 1 1 2SGEVS IL3RS RTKSN KDGGNQEVE LARCH KGQYP 
GE1ALVTNKPRAASAY AVGDVKCLVI^DVOAFERLLGPCM D IKKR 
K I SHY EEQliVKWFGSS VDLGNLGQ 


5986


1806 




DAW KSTS LTF KWK LWGKHRGR K S G I AH PKNULS PQQGG Al PQV P ' 
S PCCRFDS PR GPP PPRLGLLGA L ilAEDGVRGS PPV PSG PPMEEF 
GLRW7PKSPbUPDSGljbSCTl,PKGFGGQSG?EGERSLAPPDASl 1 
LZ5NVCS IGDKVAQEI »FQ3SD LG KAEEAER PGEK \AGQKS PLRE | 
EHVTCV0S2LDEFLQT\YGSL1 PIS TDE VVEKLED I FQC E FST F | 
SRXGLVLQL1 CSYQRMPGNAMVKGFRVAYKJUIVLTMDDLGTLYC j 
QN K i j M DQ VMN V, Y G DLVMDT V F E K \VHFFNSFFY\DKLRTKGYDG 
VKRKTKNVDI FNKELLHPIHLEV--JWSLISVDV7?KRTI'rYFDS0 ! 
RTLKRRCPKK ~ AK Y LQ AE A VK K D F. I ..D FHOG W KG Y F KNNV A RQNN . 
DSDCGAFVLCYCKHLALSQPFSF70QDMPKLRRQ3 YKELCHCKL j 
TV 


5987


1806 


4b4 


DAWKSTSLTFHWKLWGRHRGRKRGLAHFKNHLSPC'0<5GATP<3VP j 
SPCCKFDSPRGFPFPRLGLLGALKAEDGVRGSPPVPSGPPtfEED | 
GLRWTPKSPLDPDSGbLSCTLFNGFGGQSGPEGERSLAPPDASl 
L3 SNVCS3GDH VAOELFQGSDLGrAEEAERPGEK\AGOHSPLRE 1 
EHVrCVQSILDEFL0T\YGSLIPL£TDEWEKLEDlFQOEFSTP j 
SRKGLVLOLlpSYORMPGNAMVhGFRVAYKRHVLTKDDLGTLYG 1 
CNWLKDOVKN K YG DLVMDTVP EK \ VH FFNS FF Y \D KLRTK G YDG 
VKRWT KNVD 1 F'N X EI ,LL I PI HLE VH W S LI S VDVR RJVT1T Y FDSO 
RTLN KRCPKH ~ AK YXQAEAVKKJ>F. LDFHQGWKGYFKMNVA RQNN 
DSDCGAFVLQV ciOLLAl.SOPFSF'J OQDMPKLRRO^ YKELCHCKL j 
TV i 


5988


1292 




FKi<.V F LS FLGL.LF. SSH SRDR 1 IBn 1 : • ArLMFLLATKNl -WJWFTCR FC 
RLDCJ YLNAG3 ^PNPOLNIKAM.FGLFSVAEGLbTOGDKITADG 
LOEVFETDVFC-HFILIRELFPIACHSDNPSOLIWTSSRNARKSN ; 
FSLECFOHSKGKFJPYSSSKYATDLLSVAIjNRNFNOOGLYSNVAC ' 
PGTA LI'H I /TYG I LPPFI WTLLKPM LLLR F F AN A ?TLT p Y NGTE , 
ALVRL Yri OKP E .S LN P L I KYLSA7 V 7 CFGRNY1MTQKMDLDEDTAE 
KFYC K 1 .LELEKK 1 RVTI QKTDNQAm LSGSCL 


5989


194 


261 i 


AMEFPQHSOHVLEOLNOORQl^LlCDCl-FVVDGVHFKAHKAVLA ; 
ACSE V FKMLFVDOKDVVHLDI S N AAGLGQ VLfc'FM Y TAKL£ LS PE | 
NVDDVLAAVATFLOM0DI ITACHAoKSLAEPATSPGGNAEALAT i 
EGGtKRAKEEKVATSTLSRLEOAGRSTPIGPSRDLKEERGGQAQ | 
SAAS(;AEOTEK^DAPREPPPVELKPDPTSGMAAAEAEAAbSESS j 
EQEy.EVFPARKGEEECKEOEEQKEEGAGPAEVKEEGSOLENGEA 
PEENHKEESAGTDSGOELGSEAPGLKSGTYGDRTESKAYGSVIH 
KCEDCGKEFTKTGNFKKHIRIHTGF.KPFSCRECSKAFSDFAACK 1 
AHEKT):SPLKPyoCEECGXSYRL:?.LLNLRKKRHSGEARYRCED 

cgklfttsgnlkrholvhsgekpyccdycgrsfsdptskmrhle 
tkdtokehkcphcdkxfnovgni.kahlkihiatx5plkcrecgkq 

FTTSGNLJCRHLK JHSGEKPYVCIHCgRQFADPGALQRHVRI HTG 
EXPCOCVMCGKAFTOASSLIAHVRQHTGEKPYVCFRCGKRFVOS 
SQ1»AK T H 1 R HHDN 1 R PH KCS VCS KA FVNVGDLS KK 1 1 IHTGEKPY 
LCDKCGRGFNR\a}NLRSnVKTVHOGKAGIKILEPEEGSEVSWT 
VDDMVTl^TEAiJ^^TAVTQLTVVPVGAAVTADETE\ r LKAE I SKA i 
VXOVOEEDPNTKIbYACDSCGDK.Fl.DANSLAOHVFI HTAOALVM j 
FQTDAEF YCX?y G PGGTWP AGOVLG AG ELV FR PRDG AEGQPALAE 
TSPTAPECPPPAE | 


5990 


2 


4700 


FGPG PDSGGGARGSG WGSRSQAPYGTLGAVSGGECVLLHEEAGD 
SGFVSI^RLGPS LRDKDLEMEELMbODETLLGTMOS YMDASL1 S 
L1EDFGSW5EVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGOcPPPQOHSDGEEEEEVASFSGOILAGELDNCVSSIPDFP 
MHLACPEEEDKATAAEMAVPAAGDES 1 SSLSELVRAMHPYCLPN 
LTHLAS LEDELOE OPDDL>'i*bPEGC WLEI VGQAATAGDDLE I PV 
WRQVS PG PR PVLLDDSLETSSALC LLMPTLESETEAAVpX\TTL 
CSEKFGLSLNSEEKLDSACLLKPREWEPWPKEPONPPANAAP 
GSQRAR KGRKKKSKEQPAACVEG YARRLRSSSRGOSTVGTEVTS ; 
QVDNLOKQPQEELOKESGPLQGKGK P RAWXRA W AAALEN SSPKN j 
LERS AGCSS P AKEGPLDLY PKLADT I QTNP I PTHL S LVDS AQAS j 
PMPVDS VEADP TA VG P VLAGP VP VDPGLVELASTS $ ELVE PLPA 
EPVLI K P VLADS AAVDPA VVPI SDN LP PVDAVPSG P APVDLALV j 

WO 01/53312 



PCT/US00/34263



DPVPNDLTPVDFVXVKSRPTDPRRGAVSSAliGGSAPOLLVESES 
bDPPXTIIPEVKEWPSLKIESGTSATTHEARPRPLSbSEYKKR 
R0OR0AETE k: K S PC P P'J'G KW PSLPET PTGLADI PCLV 1 P ? A PAX 
KTALQRS PETPbE I CLV P VG PS PAS PSPEPPVS KPVASS PTEQV 
PSQEMPLLARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCyPKVSPSGYP 
CLPPPPTVPLVSGTPGAYAVFPTCSVPWAPPPAPVSPYSSTCTY 
GPbGWGPGPOKAPFKSTVPPPFLPPASIGRAVPOPKMESRGTPA 
GPPENVLPLSflAPPLSLGLPGHGAPOTEPTKVEVKPVPASPKPK 
KKVSALVOSPOMKALACVSAEGVTVEEPASERLKPE'TOETRPPE 
KPPLPATKAVPTPRCS TVPKLFAVHPARLRKLSFLPTPRTOGSE 
DWQAFI SEI G I EASDLSSbbEQFEKSEAKKECPPPAPADSLAV 
GN SGG VD 1 PQ E K >i PL D R LQAPELANVAGbT P PATPPHQLWK PLA 
AVSLLAKAKSPKSTAOEGTLKPEGVTEAKIIPAAVRLQEGVHGFS 
R VHVGSG DKDY C \VR S R TP P KK\ MPAL.LI PEVGSRWKVKRHQD 1 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEOADPSAPCLAPS 
SLLSPEASPCRNDMNTRTPPEPSAKORSKRCYRXACRSASPSSO 
GWQCJR^GRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSFP 
HKRWR3SSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PSPRR^SDRRRRYSSyPSHDHYQRORVLCKERAIEERRWFIGK 
IPGRMTRSELKQRFSVFGEI EECTIHFRVQCDNYGFVTYRYAEE 
AFAAIESGHKbROArEOPFDbCFGGRROFCKRSYSDLDSNREDF 
DPAPVKSKFDSLQFDTLLKQAQKNLRR 


5991 


334 


1379 


RLSSHFSOCS PS I YC\TKFDKGGNVTS FERKK.TELYQEbGL.QAR 
DLRFQHVWS1 TVRNNRI IMRMEYLKAVITPECbLILDYRNLNLK 
OWLFRELPSObSGEGOLVTYPLPFEFRATEALLQYWINTLOGKL 
SILQPLI LETLDALGDPKHSSVDRSKLHILLQNGKSLSELETD3 
. K1FKESILEI LDEEELLEEIiCVSKWSDPQVFEKSSAGlDHAEEM 
ELLLEKY YRLADDLSKAARELRVLIDDSQS 1 1 FINLDSHRNVMK 
RLhJLQija'NGTFSLSLFGLMGVAFGMNLESSLEEDHRIFWLI TGI 
MFMGSGLIWRRLLSFLGR/LARSSIASYGMKDMVHGG3VEGL 


5992 


2 


609 


AGPDFRUVCGVSGSGFPGGRQGCATENRPLRPWNGAMEKLRRVL 
SGQDDSEQGbTAQDSQI Nb/ SEVLDAS SbS FNTRLKVJFA1 CFVC 
GVFFSI LGTGLLWLPGG 3 K1FAVFYTLGNLAA1ASTCFLMGPVK 
QLKKMFEATRLLAT1VMLLCF1FTLCAALWKHKKGLAVLFCIU) 
FbSMTHYSLSYIPYARDAVIKCCSSLLS 


5993 


1650 


594 


AEGLGSWAVWAGI^WAGKHMEAGGATGALGVGCKJLPSAFCFPGS 
SVAMDMFgKVEKIGEGTYGWYKAKNRETGObVALKKlRLDLEM 
EGV PSTA1 RE 1 SLLKELXHPN 1 VRLLDVVHNBRKLYLV FEFbSC 
DLKICYHDSTPGSELPbHLIKSYbFQbLOGVSFCHSHRVIHRDbK 
PQNLLlNELGAl XLAEFGLARAFGVPLRTYTHEVVTLWYRAPE3 
LIATRFriTAVDI WS I GC 3 FAEMVTR KALFPGDS \EI DQ\ L FR I 
FRMLGTPSEDTWPGV7 QLPDY KGSFPKWTRKGLEE1 VPNLE PEG 
RDbLMOliLOYDPSORITAKTALAHPYFSSPEPSPAAROYVLORF 
RH 


5994 


394 


1934 


AGEVQhn VW3 RGMR 1 QFQJ KAAA 1 1 DbDPDFEPQS RPRSCT WPL 
PRPEIANQPSKPPEVEPDLGEXVHTEGRSEPILLPSRLPEPAGG 
PQPGILGAVTGPRKGGSKRNAWGNOSYAELISQAIESAPEKRLT 
LA01 YEWMVRTV PY FKDKGDSNSSAGWKNS 1 RHNLSLHSXF 1 KV 
HNEATGKSSWWMLNPEC-GKSGKAPRRRAASMDSSSKLLRGRSKA 
PKKKPSGUPAPPEGATPTSPVGHFAKMSGSPCSRNREEADMWTT 
FRPRSSSNASSVSTRLSPLRPESEVLAES1PASVSSYAGGVPPT 
bNEGbEbUX5LNbTSSHSLLSRSGl>SGFSbQHPGVTGPLHTYSS 
SLFS PAEG PbSAGEGCFSSSQALEALLTS DTPP P PADVbMTOVD 
PI LSQAPTTLLLGGLPE SSXLATGVGLCPKPLE APGPSSLV PTL 
SMJAPPPVMASAPI PKAX/jTPVLTPPTEAASQDRMPQDbDLDMY 
MEKLECDMDNI ISDLMDEGEGLDFNFEPDP 


5995 


2 


2437 


RPPGPGPASGAVIbCTRARGSAAFVPPLPRPPSRGARRRRRbPGR 
GVAAbJRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEWMEELHSbX DP\ RRQEbLEARF\TGLGVSKGPLNS ESSNQSb 
CSVGSbSDKEVETPEKKONDORNRKRKAEPYETSQGKGTPRGHK 
ISDVfKRRVEOP: VGLDGSAAKEATEEQSALPTLHSVMLAKPRL 
DTE0LA0RGAGL C PTFVSAQQNSPSSTGSGNTEHSCSSOKCI SI 

OHRQTNOSDLT- r KI salens knsdlekkegriddllrancdlr 

ROI \DE0QKK»LE i.YK\ERLNRCFDNEPRMFLl EKSKQEKyiACRD 
KSMODRLRLGKF. TVRHGASFTE0WTCGYAF0ML3KQQER1NSQ 
REE 3 ERQRJCMLAKRK PPAMGQAP P ATNEQKQR KS KTNGAENE TL 
TLAEyHE0EEIKr:.RLGHLKKEEAEIOAELERLERVRNLHIREl> 
KK1KNEDNS0FKUHPTLNDRYLLUILLGRGGFSEVYKAFDLTE0 
RyvAVKIHOLNK v :WRDEKKENYH)GIACREYRlHKEIiDHPRIVKL 
YDYFSLDTDSFCTVLEYCEGITOLDFYLKQHKLMSEKEARS21MQ 
I VNALK YLNE1 K ! ■] JHYDLKPGNI LLVNG7ACGE1 KITDFGLS 
KIMDDDSYNSVDoKELTSOGAGTYWYbPPECFWGKEPPXISNK 
VDVWSVGVIFYCCLyGRKPFGHNOSOQDILOFNTILKATEVCFP 
PKPWTPEAKAF1 KRCLAYRKEDRIDV00LACDPYLLPH1 RKSV 
STSSPAGAAI A?7 SGASNNSSSN 


5996 


1612 


981 


DCOACLLGLMLT^rFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFSJWFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL 
AFLTCLLYLAJjDVVFPOISSVKDRKXVAVLSGHPVVSGEPKPAA 
FWArLWFTGDSCYI \ANQWQVSKPXDNPLNEGTDASPGRPSPFS 
FFS1 FTWSLTAAL./-VRRFKDLSFQEEYSTLFPVASAQP 


5997 


1612


981 


DQQAC1 XGLMLT; FFGI LEFDPS W 1 GS WTQR / S WVSWRSR FGCE 
L FS 1 WFGS I VN F C- Y LNS AS EG EE FC I YNRNPN ACS YGVAVGVL 
AFLTCIXYLALDVyFPQISSVKDRKK\AVLSGHPVVSGEPHPAA 
FWAFLWFTGDSCY i \ANQWOVSKPKDNPLNEGTDASPGRPSPFS 
FFS I r TWSLTAAL ^.VRRFKDLS FQEE YSTLFP \ ASAQP 


5998 


1612


981 


DOOACbLGLMLTLEFGlLEFDPSWiGSWJ'UR/SWVSWRSRPGCE 
LFSI WFGSI VHEC -YLNSASEGEEFC1 YURNPH ACS YGVAVGVL 
AFLTCLLYLAI>DVYFPQISSVKDRKK\AVLSGHPVVSGEPHPAA 
PWAFLWFTGDSCY1\ANQW0VSKPKDNPLNEGTDASPGRPSPFS 
FFS I FTWSLTAAL A\ra R FKDLSFQE E Y STLFP \ ASAQP 


5999 


2 


1790 


rppmekarrggdgvprgpvlhivwgfhhkkgcovefsypplip 
gixhdshtlpeew'kylpflalpegahnyoedtvffhlpprngng 
atvfg i scyr \ 0 1 -\ akalkvrqad 2 tr etvqk svcvlskl pl yg 
lltfaklqll tray f ee kdfsqis i lkelyehmn s s lggas legs 
o^'lglsprdlvlhfrkkglilfklillekkvlfyispvnklvg 
almtvlslfpgmlihglsdcsqyrprksmsedggloesnpcadd 
f vsa s tadvshtk ; ,o t i rkvmagnhg edaamkte eplfqvedss 
kg0epndtnqylk7psrpspdssesdwetldpsvledpnlkere 
0lgsp0tnlfpkd5 vpseslpitvqpqantgqvvli pgljsgle 
eoqygkplai ftkg ylclpymalqokhllsdvtvrgfvagatni 
lfrookhlsda j vf veealioi hdpelrkllnpttadlrfadyl 
vrhv^enk uov via )gtgweggdewi raqfavy i kallaatlqlv 
lfr1 vnvakk1 gnv mvtt \ s rnwotg k \avgqs vgg afs \ s ak 
taNmsswlstfttstscslteppdekp 


6000 


101 


1561 


TEPCR TAENCTATK5 ErWKNSLESSLROLKCHFTWNLMEGENSL 
DDFED KVFYRTEFC >IREFKATMCNLLAYLKHLKGQNE AALECLR 
KAEEL 3 0OEHAD0A 1 1 RSLVTWGNYA W VY YHMG R LS DVQI YVDK 
VXHVCEKFSSPYK : ?:SPELDCEEGWTRLKCGGN0NERAKVCFEK 
ALEKKPKNPEFTSGIAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQYLKVLLALKLH K MR EEGEEEGEGEK \I*VEEALEKAPG\ VTDV 
LRSAA\KFYRGKDr?OKAIELLKKALEYIP\NNAYLHCQIGCCY 
RAKVFOVMNLREMCKYGKRKLLELIGHAVAHLKKADEANCNLFR 
VCSILASLHAI^QYEDAEYYFQKEFSKELTPVAKOLLHLRYGN 
F0LY0MKCEDKA1HKFIEGVKINQKSREKEKMKDKLQKIAKWRL 
SKNGADSEALHVLA FLQELNEKMOQADEDSERGLESGSLI PSAS 
SWNGE 


6001 


176 


1038 


AFAHS PSRGHRKTK 1 HTPRHTPRCTMAESHL0SSLXTASQFFE1 
WLHFDADGSGYLEG KELQ^IQELQQAJ^KKAGLELSPEMKTFVD 

oygcrxidgkigivflj^vlpteenflllfrcoolksce\efmkt 
wrkydtdhsgfi et e elknflkdllekanxtvddtklaeytdlm | 
LKiiFDSNWDGKI^LTEMARLLFVQENFLLKFQCSIKKCGKEFNKA 

fe: YDCDGKGY 1 deneldallkdlceknkqdld inni ttykkni 

KALSDGGKLYRTDLALILCAGDN 


6002 


311 


81 


lappggglmpprtplshsrpppshhaphpsplplppadlhphs 
smaorsdlleudcoltrdrwwshdenlcrosglnrdvgsldf 
edl plyke kle vy fs pghfakgsdrrmvrle ^lfqr fprtpms v 
eikgkneelireq/vlvrrydrneitiwasekssvhkxckaanp 

EHPLSFTl SRGFWVL.liSYYLGLLPFIPlPEK?FFCTbPNI INRT 
YFPrSCSCUJQLLAWSKWLIMRKSLlRHLEERGVOWFWCLNE 
ESDFEAAFSVGATGVITDYPTALRHYLDNHGPAARTS 


6003 


140 


4098 


GKLRAFRGMRRLICKRICDYKSFDDEESVDGNRPSSAASAFKVP 
APKTSGNPAJSSARKPGSAGGPKVGAGASKEGCAGAVDEDDFIKA 
KJ 'DVPSlQJYSSRt LEETLNK I RE I LSDDKHD WDQRANAZ,KK I R 
SLLVAGAAOYDCFFOHLRLLDGALKLSAKDLRSQVVREACITVA 
HLSTVLGNKFDKGAEAIVPTLFNLVPNSAKVMATSGCAAIRFJI 
RKTHVPRLlPLlTSNC.SKSVPVRRRSFEFLDLbLCEVJQTHSLE 
RHAAVLVET3KKGIHDADAEARVEARKTYMGLRNHFPGEAETLY 
KSbEPSYOKSLiQTyLKSSGSVASLPOSDRSSSSSOESLNRPFSS 
K W S TANPS T VAG R VS AGS S KAE S bPGSLQR £ R S D I D VN AAAG AK 
AHHAAGOS VR SGRUG AG ALNAG S Y ASLEDTE D KLDGT AS EDGRV 
RAKl.SAPUAGMGNAKADSRGRERTKMVSQSOPGSRSGSPGRVLT 
TTALSTVSSGVQRVI-VNSASAOKRSKIPRSOGCeREASPSRLSV 
ARSSRIPRPSVSCGCSREASRESSRDTSPVRSFQP1ASRHHSRS 
TGALYAPEVYGASGPGYGISOSSRLSSSVSAMRVbNTGSDVEEA 
VADALbLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
YSS RNG SI Y TY MR(?r \ ED V\ AE VLNRCASSN WS ER KEG L LGLQN 
blK^QRTLSRVELKRbCE 1 FTRMFADPHGKRY FSMFbETbVDFI 
OVHKI^Dl^DWLFVLLTOLLKKMGADLLGSVOAKVOKAI.DVTRES 
FPNDLQFNI LMRFTVDQTQTPSLKVKVAILKYI ETIjAKQMDPGD 
FIN.SSETRL-AVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
FTM LLG ALP KTFODG ATKLLH WLRNTGNGTO S SMG S P LTR PTP 
RSFANWSSPLTSPTNTSONTLSPSAFDYDTENMNSEDIYSSLRG 
VTEA10NFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
ggdatdssc>taiA DN KASLLHSMPTH SS PRSRDYH ? W YSDS 1 S 
PFNKS AJLKE AKFDDDADOFPDDLSLDHSDLVA E LL KF.I .SNHNER 
VEEKKIALYELMKLTOEESFSVWDEHFKTILLLLLETbGDKEPT 
I RAJLALKVLRE I LRHQPARFKN YAELTVMKTLE AHKDPHKEWR 
S AEF AASV\ LiATS I V S PEQCI KVLC P 3 IQTADY P I NLAA1 KMQT 
KVlERVSKETLKLLLPEIMPGtlQGYDKSESSVHKAC^FCLVAV 
HAV2 GDELKPHLSOLTGSKMKLLNbYl KRAOTGSGGADPTTDVS 

GQS


6004 


140


4098 


GKLKAFRGMRRL1CKRICDYKSFDDEESVDGNRPSSAASAFKVP 
APKTSGN PANS AH KPG S AGGP K VG AGA S K EGG AGA VDE DD F I KA 
FTDVPS1 01 YSSR ELEETLNKIREILSDDKHDWDQRANALKKIR 
SLL VAG AAQ Y DC F F0H LRLLDG AL KLS AKDLRSOWR EAC 1 TVA 
HLS TVLGN K FDHG AE A I VP TLFNLV PNS AKVKATSG CAA I R F 1 1 
RHTHVPRU PLITSNCTSKSVPVRRRS FEFLDLLLC/EWCTHSLE 
RHAAVLVETI KKG IHDADAEARVEARKTYMGLRNHFPGSAETLY 
NSbEPSYOKSLQTYLKSSGSVASLPQSDRSSSSSOESbNRPFSS 
KKSTANP STVAGRVS AGSSKASSLPGSLQRS R S DI DVNAAAGAK 
AH HAAGQS VR SGR LG AG ALNAGS Y AS LEDTS DKLLDGTAS EDGR V 
RAKbSAPLAGMGWAKADSRGRSRTKMVSOSO?GSRSGSPGRVLT 
TTALSTVSSGV0RVLVNSASA0KRSKI PRSOGCSREAS PSRLSV 
ARSSRIPR PS VSQGCS REASRESSRDTSPVRS FQPLASRHHSRS 
TGALY APEVYGASGPG YGI SQSSRLSSSVSAMR VLNTGS DVEBA 
VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
YSSRNGSlPTYMROT\EDV\AEVLNRCASSm)SERXEGbLGLON 
LLKNORTbSRVEUKRl.CEIFTRMFADPHGKRVFSMFLETLVDFI 
QWKJ3DIX3DWLFVI^TQLLKKMGAD^ 

FPNDl ,QFNI LMR FTVDQTQTPS LKVKVAI LKYI ETLAKQMDPGD 
FIWSSETRLAVSRVlTVnTEPKSSDVRKAAgSVLISLFELKTPE 
HS P ANWS SPLTS P TT3TS0NTLS PS AFDVDTENMNS ED I Y SSLRG 
VTEAlCNFS?RSCEDMNEPLKRDSKKDDGt)SMCGGPG\MSDPRA 
GGDATOSSQTAlADNKASLLKSMPTHSSPRSSDYNPYNYSDSIS 
PFNKSALKEAMFDDDADQFPDDLSLDKSDLVAELLKELSNHNER 
VEERKIAJjYELMKLTOEESFSWDEHFKTILLLLLETLGDKEPT 
] RAJL\LKV1,REI LKHQPARFKNYASLTVMKTLEA11KDPHKEVVR 
SAEEAASV\LATS1\SPE0C3 KVLCPIIQTADYPINLAAIKKQT 
KVI ERVSKETLMLLLPEIMPGLIOGYDNSESSVRKACVFCLVAV 
HAV1GDELKPHLS0LTGSKKKLLNLYIKRAQTGSGGAI)PTTDVS 
GQS 


6005 


133 




RSSGRRCEQLGQFfGRERKGMASGLGSPSPCSAGSEEEDMDALL 
NNSLPPFHPENEEDPEEDLSETKTPKXKKKKKPKKPRDPKIPKS 
KRQKKEPMLLC3QLGDSSGEGPEFVEEEEEVALRSDSEGSDYTP 
GKKKXKXLGPKKEKKSXSKRKEEEEEDDDDDDDSKEPKSSAQLL 
EDWGMEDlDtfVFSEEDYRTLTNYKAFSQFVRPLIAAKNPXIAVS 
KMNMVLGA KWRE FS TNN PFKGSSGAS V AAAAAAAV AW ESMVT A 
TEVAPPPPPVEVP3K KAKTKEG KG PNARRKPKGS PR VPPAKKPK 
PKKVAPLKIKLGGFGSKRKRSSSEDDDLDVESDFDDASIN5Y5V 
SPGSTSRSSRSRKKLRTrKKKKKGEEEVTAVDGYETDHODYCEV 
C00GGEJ I LCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 
CWEAKEDNSEGEEII.EEVGGDLEFEDDHKMKFCRVCKDGGELLC 
CDTCPSSYHIHCLN-PPLPEIPKGEWLCPRCTCPALKGKVOKILI 
KKWGQP PSPT PVPR PPDADPNTFS P K PLEGR PER OFF V KttQGMS 
YWHCSWVSELQ1>F.LHC\QVMFRNY0RKNDMDEPPSGPFGGDEEK 
S\RKRKNKDPKFAEKEERFYRYGIKPEW\MMIHR1LNHSVDKKG 
IWllYLIKWRDLPYDOASWESEDVEIODYDLFKQSYWNHREbMRG 
EEGRPGKKLKKVKLr<KLERPPETPTVDPTVKYERQPEYLDATGG 
TLHPYQMEG1.NWL.RFSWA0GTDT3JLADEKGLGKTVOTAVFLYSL 
YKEGHSKGPFLVSAPLSTIIN\WEREFEMWAPDMYV\VTYVGDK 
D£ RA 1 1 R EN EFS \ FEDNAI RGG KKA SRMKKEAS VKF-H VLLTS YE 
LI T 3 DMA I LGS I D WACLI VDEAKR L KNNQSK FFR VLNG YSLOHK 
LLLTGTPLCNNLEEI.FHLLNFLTPERFHNLEGFLEEFADIAXED 
01 KKJLHDMLG\PHMLRRLKADVrK.NMPSKTELIV\RVELSPM\0 
KKYYK\YILHSKFLKAIjN\ARGGGNOVSbLfr/VMDLKKCCNHPY 
LF PVAAMEAPKMPNGMYDG SAL I RASGfClALI^KMLKNLKEGGH 
RVL1FSOMTKMLDLLEDFLEHEGYKYERIDGGITGNMRQEAIDR 
FNAPGAOQFCFLLETRAGGLGINLATADTVI 1 YDSDWNPHNDI 0 
AFSRAHR I GQNKJCVMI YRFVl'RASVEER I TOVAKKKMWLTHLW 
RPGLGSKTGSMSKOELDDILKFGTEELFKDEATDGGGDNKEGED 
SSVIHYDDKAIERLLDRNODErEDTELOGMNEYLSSFKVAQYVV 
REEEMGEEEEVER E 1 1 KOEESVDPDYWEKJLLRHHYEOQQEDLAR 
NLGKGKR I RKQVNYNDGS0EDRDWODDOSDNOSDY SVASEEGDE 
DFDER SE APRR PSR KGLRNDKDKPLPPLLAKVGGN 1 EVLGFNAR 
OR KAFLNA3 MR YGK PPODAFITOWLVR DLRGKSEKEFKAYVSI»F 

MDin r-PW^li.TVlJlPTPttTV^VDCSr'^t .CPOHUl.TBIRVMSLlRKICVO 
H i Jvij xj\^ sit runiA3/u- a r r\l^/\3y r KZ.oVjorvS*'* "wi^-'v * ri<*uin«^» \f 

EFERVNGRWSMPELAEVEENKKxMSQPGSPSPKTPTPSTPGDTQP 
NTPAPVPPAEDGI K J EENSLKEEES I EG EKEVKS TAP ETA I ECT 
QAPAPASEDEKVWEPPEGEEKVEKAEVKERTEEPMETEPKGKG 
AADVEKVEEKSAIDLTPIWEPKEEKKEEEEKKEVMLQNGETPK 
DLNDEKQKXNIKQR FMFNIADGGFTELHSLWONEERAATVTKXT 
YEIWHRRHDYW1>LAGIINKGYARWGDI0NDPRYA3LN5:PFKGEM 
NRGNFLEI KNKFLAK R FKLLEQALV1 EEQLRRAAYLNMSEDPSH 
PSMALNTRFAEV^CLAESHOHLSKESMAGNKPANAVLHKVLKOL 
EELLSDNKADVTRLPATI ARI PPVAVRLOMSERN I LSRLANRAP 
EPTPQOVAQ0Q 


6006 


1 


" "96S 


Dtf DFLRNTVHRHEPFVTAEP1 RLLAENEDWWDKPSS IPVHPC 
GRFRHKTVIFILGKEHQLKELHPLHRLDRLTSGVIjMFAJCTAAVS 
ER I hecvrdrqleke Y VCRVEGEFPTEEVTCKEP I LWS YKVG v 
CR VDPRGKPCETVFQ R LS YKGQS S WRCRPLTGRTHO I R VHLQF 
LGHPI LNDP I YNSVA WGPSRGRGG Yj PKTKEELLRDLVAEHQAK 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
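
The single-letter legend in the table header can be read as a lookup table. The following is a minimal Python sketch, not part of the original disclosure, assuming each amino acid segment is handled as a plain string. The names `LEGEND`, `decode`, and `unrecognized` are hypothetical and introduced only for illustration.

```python
# Symbols and meanings copied from the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}

NOT_IN_LEGEND = "not in legend"


def decode(segment):
    """Return (position, symbol, meaning) for every symbol in the segment.

    Spaces are dropped first; positions are 1-based and counted after that.
    Letters are matched case-insensitively (an assumption of this sketch).
    """
    compact = segment.replace(" ", "")
    return [
        (pos, ch, LEGEND.get(ch.upper(), NOT_IN_LEGEND))
        for pos, ch in enumerate(compact, start=1)
    ]


def unrecognized(segment):
    """Return (position, symbol) pairs for symbols the legend does not define."""
    return [
        (pos, ch)
        for pos, ch, meaning in decode(segment)
        if meaning == NOT_IN_LEGEND
    ]


if __name__ == "__main__":
    for row in decode("MK*"):
        print(row)
    print(unrecognized("AG1B"))
```

A helper such as `unrecognized` may be useful for flagging characters in a transcribed segment (digits or punctuation, for example) that fall outside the legend and therefore need to be checked against the original listing.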








CSbDVl.DLCEGDLSPGLTDSTAPSSELGKJDDLEEUlAAA\OKME 
EVAEAAPOEIuDTI ALASEKAV E7DVMNQ\RQT\TLCRVPAGATG 
SLAPRPCDVPTCPTL 


6007 


3 


2 2 1-3 


}:elg0veyvftdktgtltenemofrecs2ngmkyoe2ngrlvpe 
gptpdssegnlsytsslslllnnlshlttsssfrtspbnetelik 
lhd1.ffkavslchtv0innv0tdctgdgpwqsn1aps0leyya5 
sppekalveaaarigjvfignseetmevktlgklerykllhjle 
f dsdrrr ms v 1 v qapsge kll pakgaess 1 lpxc1 gc-e 2 ektr 1 
hvdefalkglktlciayrkftskeyeeidkr:feartaloor\e 
1- kj..aavfcfj ekdullgatavedrlqdkvretjealrmagikv 

vnWTGD KHETAVS VSLS CGHFHRTMN I LELJ NQK5 DSECAEQLR 
CL/vRRITEPHVIOHGLXTDGTSLSlALRFHEKLFMEVCRNCSAV 
L CCRMAPL0KAKV IRL 3 KI SPE KP I TLAVGDGANDVSM 2 OEAH V 
GIGlMGKEGR0AARNSDYAIAPFKFLSKLLFVHGHFYY3RlATb 
VQY FFY KNVCFI TPQFLYOFYCLFSQQTLYDS VYLTLY \N I CFT 
£LP1LI YSLL»E0HVDPHVl>QNKPTLYRD2SKNRLLfSIKTFLYWT 
3 LGFSHAF3 FFFGSYLL2GKDT£LLGNGCMFGNWTFGTLVFTVM 
V J TVTVXMAloETHFWTWJ NHLVTWGS 1 1 FY FVFS LFYGG2 LWPF 
LGSQmVFVF IQLljSSGSJWVfil ILMWTCLFLDI I KKVFDRHL 
}lPTSTEKA0LTETNAG2KCLDSMCCFPEGEAACASVGRMIiERV3 
GRCSPTKISRSWSASDFFYTNDRSILTLSTMDSSTC 


6008 


45S4 




AG VR RAG ARKG PG RAI>F AGATAVFPP SARR RRRCPAPEHAG PAR 
ASRPS0ETMFOLPVNNLGSLRKARKTVXKll,SDiGLEYCKEHlE 
rFKOFEPKDFYLKWTTWEDVGI.WnPSLTKNQDYRTKPFCCSACP 
FSSKFFSAYK.SHFRWHSEDFENR2LLNCPYCTFNADKKTLETH 
J K2 FHAFNASAPS SSLSTFKDKNKNDGLKPKQADS VEQAVYYCK 
XCTYRDPLYE 2 VR KHI YREHFQHVAAPY 2 AKAGEKSLNGAVPLG 
£ MAREES S 2 HCKR CLFMPKS YEALVQHVI EDHER 2 G YQVTAMIG 
HTNVWFRSKFLMLIAPKPODKKSMGbPPRIGSIASGNVNRSbP 
SOOKVNRLSIFKPNLWSTGVNM^SVIIIXKJNNYGVKSVGCGYSV 
GQSMRLGLGGNAPVS I POQSQS VKQLLPSGNGRS YGLGSEQRSQ 
APARYSLQSANASSLSSGQLXSPSl^QSQASRVLGQSSSKPAAA 
A7GPPPGNTSSTOKWK1CT2CNELFPEMVYSVHFEKEHKAEKVP 
AVAJ^Y2MKi;-INFTSKCLYCNRYLPTDTLLNHML2HGLSCPYCRS 
T FNDVEKi'AAHMRNVH I DEEMG P KTDST1*S FDLTLQQGSHTN 3 H 
LLVTTYNLRDAPAESVAYI1A0NNPPVPPKPQPKV0EKADIPVKS 
S PQAAVPYKKDVGK7LCPLCFS I LKGPI SDA1AHKLRERHQV2Q 
T^riP VEKKLT Y KC 3 HCLGVYTSNMTASX 1 T LHbVH CRG VGKTON 
GCDKTNAPSRL.NOSPSLAPVKRTYEQMEFPLLKKRKLDDDSDSP 
SFFEEKPEEPWLALD?KGH\EDDSYEARK£FLTKYFT\K2PYP 
TF. RE I EKLAAS LWV \WK\SDI ASHFSNKR KKCVRDCEKYKPGVL 
LGFN>lKELNKVKHEMDFDAEGLFENHDEKDSRVNASK'rADKKl J N 
LGKEDDSSSDSFENLEE2SNESGSPFDPVFEVEPK2SNDNPEEH 
VLKVIPEDASESEEKLDOKEDGSKYETIHLTEEPTKLMHNASDS 
E VDQDDW EWKDGAS PSESGPGS QQVSDFEDNTCEMKPGTWSDE 
S £ QSEVAR SSK PAA KK KATM0GDREQLK W KNSS VGKVEG FWS KD 
0S0HKNAS ENDERLSNPQIEWQNST I DSEUGEQFDNMTDGVAEP 
MHGSLAGVKLSSOQA 


6009 


4272 


2534 


CHGLOHLT F FR EbhTLSl/JG * EPH * AA* QAVRSEEKS 2 C * GS PSC 
HLVLGVLV PVAR0SSKSAG P AOS AFR* TGTGSGTPKAAEQSGYK 
FJ^YTLGHOHVWKFPJQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGE0AS0RJRTVFTAGGGECLGAKSVR7\SVFTGNGPGV1^GLL 
*SG XRGGCF E SG Y LFXr F 1 V I GKI OSLEAKV PLPVNGQTG ERA S PG 
NCR IHI VDAVC* SEHK* DHFLAAAFLENST1 IS* VAPGSWQDHA 
VLOKEVQASVRCRGFESVDTAPAGFWAHSPPGI-QGEPTTTSVSL 
KVlAPQDGEGV P?VEGQLVTVLGLWPQS 1 RHTFVKHTObFLHP 
2 * KI^ALDVAPLHLLTLVCSSFWAYG*GKNGGTTLKOLFAEVN 
AV7RGSAVCRRPS I T2 SS2 HVDTKIOQELKDVWVAGADG WQWG 
DF FWGLAG2 FHL1 DDPLHQ1ELSFQRRV* EQCQGVKPDSQPVP 
R F LR VG I»LQVG PL VRGGGR R VAGRGKR CWRDLLF P WR WG LSH R 7 
RTLLRGGDRGHVVVIVLCRLGSLVGGLGTDELLWFGGR* LI I TG 
PCT/US00/34263 











1 * * RGR LSGEWGCGbGRGELFQVS I G 1 GVS JVHI GOGDHEV LGG 
AGliVERGALHATGQGVEALVOOLLDVGPAGAtGLCDGAALFCXSP 
GRVG0LPAEGLOVCITLVAOWRMHDGRKLGGAKWPWQALHGAAI 
CGVGGAILLKALSQYF1,KGG*RLWCA3GQ*PVKKRCRRWRG*TR 
P *NGLTI HCFN * LI * GAVCCRLVI LR WCGLLEVHGVVGT* I HCL ! 
GSFPGRLWP* PF1 SQERPKGHCQVmFRLAVPSWKCRWSRWRVRG 
TWRYGN PU ,NLL * GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
LPPFQGACRPRTQRCRTWVCPIAVJRQLLAYTRD 


6010 

1 

i 

j 




35.33 


ikpcgssrllrgcwthpnepvsdlsyfdciesvmenskvlgf.sk 1 

AG I SQNAKTGDLPAFGECVGIASKALC3LTEAAA0AAYLVGIFXS 

pn.sqaghoglvdpiofaranoaiqmacqnlvdpgsspsovlsaa j 
tivakhtsalojacriassktanpvakrhfvqsakevanstanl ! 
vkt1 kaldgdfs e3nrnkcri atapl3 eavenltafasnpefvs 
ipa01ssegsoaqepilvsakpklesssylirtarslainpkdp 
ptws vlaghsh7vsds1 ksl1 ts j rdkapgqrecdys idgi nrc 
irdieoaslaavsoslatrddisvealoeoltswoeighlidp 

1 ATAARGE AAOLGHKGTOLAS YFEPLI LAAVGVAS Kl LDHOQQM 
TVLDQTKTLAESALOMLY AAKEGGGN P KAQHTHDAI TEAAOLMK 
EAVDD1 MVT 1 iNF.AASEVGLVGGMVDAI AEAMSKLDEGTP PEPKG 
TFVDYQTTWKYSKAIAVTAOEMMTKSVTNPBELGGLASOMTSD 
YGHLAFQGQMAAATAEPEE 1 GFQI RTR VQDLGHGCI FLVOKAG\ 
ALQVCPTDS YTK R ELI ECARA VTEKVS LVLSALOAGNKGTQACI 
TAATAVGGI J ADLD7T1 MFATAGTLNAENSETFAJDHRENI LKTA 
KALVEDT KLLVSG AAST pr>K LAQAAOS SAATI TOLAE WKLG AA 
SLGSDPPETOWLINA3KDVAKALSDLISATKGAASKPVDDPSM 
YOLKCAAKVMVTNVTSLLKTVKAVEDEATRGTRALEATI ECI KQ 
ELTVFCS HDVPEKTSS PEE S I RMTKG 1 TMATAKAVAASNSCRQE 
DV I ATANLSR KAV SDMLTA CKQAS FK P DVSDEVRTRALR FGTEC 
T1GY LDLLEHVLV 1 ^QKPT P ELKQQLAAFSKRVAG AVTEL 1 OAA 
E AMKGTE WVDP EDPTV lAET ELLGAAAS I EAAAKKLEQLK P RAK 
PKOADETLDFEEO 3 LEAAKS I AAATSALVKSASAAQRELVAOGK 
VGSI PANAADDGQWSQGL1 SAARMVAAATSSLCEAANASVOGHA 
S EEKL3 S S AKQVAAS T AOL L V ACJCV KADQDS E AM R RLQAAGN A V 
KRASDKL VR AAQK A AFGKADDDD WV K'i'K FVGG 3 AQI I AAOEEM 
LKKERELEEARKKLAQIRCOOYKFLPTELREDEG 


6011 


446 


3835 


L LOP AMR KSPGLSDCLWAW1 LLLSTLTGRSYGQPSLQDELKDNT 
TVFTR3 LDR LLDG YDNRLRPGLGERV7EVKTDI FVTS FGPVSDH 
DMEYT3 DV F FRQS W KDERLK F KGPMTVLRLNNLMAS K I WTPDTF 
FllNGKKSVWlNKTMPMKLLRITSDGTLLYTMRLTVRVAECP^F 
GRDFPM\ DVAHACPLK FGSyAYTRAEV^VYSHTREPARSVVVAED 
GSRLJ^OYDLLGOTVDSGlVOSSTXSF^rVVMTTHFHLKRKIGYFVI 
OTYLPCJ MTV I LSQVS KWLNR ES VPAR7VFGVTTVLTMTTLS I S 
ARN SL PKVA Y ATAM DW F 1 AVC YAFVFS AL I EFATVN Y FT KR G YA 
WDGKS W PE KPKKVKD PLI KKNNTYAPTATSYTPNLARGDPGLA 
T1AXSATZ EPKEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLLF 
GI FNLW WATYLNR EPQLKAPTPHQ 


6012 


351 




PAELF0SFAIWHKELYDWRLGPWNOC0PVISKSLEKPLSC1KGE 
EG I QVRE I AC I OKD KD I PAEDI I CEY F EPKPLLF.QACLI PCQQD 
C1VSEFSAWSECSKTCGSGL0HHTRHV\^APPQFGGSGCPNLTEF 
0VC0SSPCEAEELRYSLHVGPWSTCSKPHSRQVROARRRGKNKE 

nT^t/T\r»ovr'Trvr»r>r > 7\T)PT Tl/VI/DU DXTDf~,l\JD/^fXJtf VUr") TOT (~2Vf~)Tlf 

RE KD R SKt* V KJJ y tAKr. Li l K. K kjcn kn KyrixytNMtwyiynjivm 
EVMCINKTGKAADLSFCQQEKLPMTFCSCVITKECQVSEWSEWS 
PCSKTCHDMVSPAGTRVRTRTIRQFPIGSEKECPEFEEKEPCLS 
OGDGWPCATYGWRTTEWTECRVDPLLSQODKRRGNOrALCGGG 
I OTRE VYC VOANEKLLS QI.STH KNKEAS K PMDLKLCTG P I PNTT 
OLCHI PCPTECEVS PWSAWGPCTYENCNDQQGKKGFKLRXRR3T 
NEPTGGSGVTGNCPHLLEA1PCEEPACYDWKAVRLGDCEPDNGK 
ECG PGTQVQEWCINSDGEEVDRQLCRDAI FP I PVACDAPCPKD 
CVLSTWSTWSSCSHTCSGKTTEGKQIRARSILAYAGEEGGIRCP 
NSSAI^E^SCWraPCTVYHWOTGPWGOCIEDTSVSSFNTTTTW 
NGEASCSVGWOTRICVJCVRWVGQVGPKKCPESU^PETVRPCLL 






PCKKDC1VTFY5DWTSCPS\SCKEGDSS1RKQSRHRVI IQLFAN 
GGRDCTDPLYEEKACi:APOACOSYRW\KTKKH\HRCQ\LVP\VfS 
VOODSP\GAQEGCGPGROARA1TCRKCDGGOAGIHECLOYAGPV 
FALTQACOIPCODDCyLTSKSKFSSCNGDCGAVRTRKRTLVGKS 
KKKEKCKNSHLY PL! K TQY C P CDKYNAQP VGNWSDC 3 LPEGKVE 
VLLGMKVOGDIKECG0GYRY0AMACYDONGRLVETSRCNSHGYI 
EEACI I PCPSDCKLf rWSNWSRCSKSCGSGVKVRSKWLR2KPYN 
GGRPCPKbDHVNQACV^E\A?PCHS0CNOYLWVTEPWS:CKVTFV 
NMRENCGEGVOTRKVRCHONT/vDGPSEHVEDYLCDPEEMPLGSR 
VCKLPCPEDGVISEWGPWT^CVLPCNOSSFRORSADPIROPADE 
GRSG?NAVEKEPGNLKKNCYHYDYNVTDWSTCOl>SEKAVCGNGl 
KTRMLDC VR S DO KS VDL K Y CEA LGLE1CNK0MNTSCMVECP VNCQ 
LSDWSPWSECSOTCGl/.GKMIRRRTVTOPFOGDGRPCPSLNiDQS 
KPCPVKPGYRWOYGOWSPCCVQEAOCGEGTRTRNISCWSDGSA 
EDFS KWDEEFCAD IE LI I DGNKNMVLEESCSQPCPGDCYLKDW 
SSWSLC0LTCVNGEDLGFGG1QVRSRPV2I0ELENQKLCPEOML 
ETKSCYUGOCYEYKVit^ASAWKGSSRTVWCORSDGINVTGGCLiVM 
S0PDADRSCNPPCSQFHSYCSETKTCHCEEGYTEVMSSNSTLEO 
CTL'PVWLPTMEDKRGDVKTSRAVHPTOPSSNPAGRGRTWFLQ 
PFGPDGRLKTW\rYGVAAGAFVl,LlFIVSM3YIACKKPKKPORRQ 
MNRLKPLTLAYDGDADK 


6013 


1161 


710 


gaf:agvpvqpvlirvpnsldttswawrgpgvlkvlwltasqpc 
s3 vdveflpvyh pspe esiidptlyannvqrvmaqalgl patece 
fvgslpviwgri.kvalepol/wgtgksasfgwavrklcgrwgr 
arpesndqpgkvcqaa7al 


6014 


2657 


61 3 


eavaggmeksrmnlpkgpdtl.cfdkdsfwktdfdvdhfvsdchk 
rvoleelrddlelyykllktamvelinkdyadf\vnlstnlvgm 
dkal^olsvplgolrfevlslrssvsesiravdermskoedirk 
kkwcvlrl10v1 rsvfk3 e kl lnsqs s kets aleass p lltgq i 
ler1ate fkqlq fhacqs k \gmplldkvr priagi tamlqqsl>e 

GLLLEGLQTSDVDI I R H CL R TY AT I DKTR DAE AL VGQ VLV K P Y I 
DEV 1 1 EOFVESH PSGLQVMYMKLLEFVPHHCPJjLREVTGG A 1 SS 
EKGNTVPGYDFLVNSWPQI VOGLEEKLPSLFNPGNPDAFHEKY 
TlSNDFVRRLEROCGSOASVKRLRAHPAYKSKNKKWNLPVYFQl 
RFREIAGSLEAA.LTDVLEDAPAESPYCLLASHRTWSSLRRCWSD 
EKFLPLLVHRLWRLHSGR FWARYSVFV\N\FLSLRP2SNES PKE 
IKKPLVTGSXEPr.ITCGNTEDOGSGPSETKPWSISRTOLVYW 
ADLDKLQEQLPELLE1 1 KPKLEWIGFKNFSS1SAALEDSQSSFS 
ACVPGLSSKIIODLSD?CFGFLKSALEVPRLYRRTNKEVP'TTAS 
SYVDSALKPLFQLOSGKKDKLKOAII0OWLEGTLSESTHKYYET 
VSDVLNSVKKMEESLkTUiKOARKTTPANPVGPSGGMSDDDKIRL 
OLALDVEYLGEQ I OKI C-LQASDI KSFSALAELVAAAKDOATAKQ 
P 


6015 


13 


2237 


AEGCA ER KGTE P W ELS M S W E SG AG P G LGSQGMD L VWS AW Y G KC 
VKGKGS LPLSAKG I WAWLSRAEWDQVTVYLPCDDHKLQR YALN 
RITW;RSR5GNELPLAVASTADL1RCKLIjDVTGGLGTDELRLLY 
GMALVRFVNLI S ER KTKF AXV PLKCLAQEVN I PDW I VDLRKELT 
HKKMPH INDCRR GC Y Fi'LDWLOKTYV/CROLENSLR ETWEL E EFR 

EGI eeedqeedkni wddi teokpbpqddgkktesdvxadgdsk 

GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
\r a t v? LtMMocODvrrui > n vptrppcMDCJiT/i st orV'TTT \7D*T* 

K/>WJvNi'i>i J K V £l_ vLiAfcLtKOv 1 ^t-NKcAV LUJ\r L*Ul->{yt i-i Vr 1 

FECLAALQIEYEEWDLNDVLVPKPFSQFWOFLLRGLHSQNFTO 
ALLERMLS ELPALG 1 SG I R PTYI LRWTVELI VANTXTGRN ARRF 
SAGQWEARRGWRLFNCSAS LDWPRMVESCLGSPCWASPQLLR I 1 
F\KAMGQGLODE \ EQEKLLR I CS I YT0SGENSLVQ2GSEAS P IG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQOEEGGSVNDVKEEEKE 

ekevlp do veee eekddqeeeeededdeddee edrmevg p fs tg 
qes f taj2n arllaq xr g alogs awq vs s edvr wdtr p \ lgr m pr 
srprtpaelmlenydthvjfwtkpvl\eorlepstck\tdtlgl 
\ scg vg s \ gncsnss sskfrgaflxeargslk \gl\ ktglolf 


6016 


13 


2237 


A5GCAERRGTEPWELSMSVJESGAGPGLGSOGMDLVWSAKYGKC 
WO 01/53312 











VKGKGSJjPLSAHGIWAWLSRAEVJDOVTVYLFCDDHKLORYAIjN 
RITVVJRSRSGNELPLAVASTADLIRCKLLDVTGGLGTDELRLLY 
GMALVK F VN L 1 SERXTKFAKVPLKC1AQEVN1 PDW3 VDLRHELT 
HKKMFKlNDCRRGCYFVbriWbQKTYVfCROLENSLRETWELEEFR 
EGIFKEUOEEDKNIWDDITEOKPEPODDGKSTESPVKADGDSK 

^Crriinrui^i/t/iM <r irv^T vcDHT)?r t vcvc^cnCMfi r*V m vt n 
OotbVroHCK lU^LSHKx^bYbKAKcLbvb ihcLOr I VLh&rHYhP 

KAIKA^NWPSPRVECVIJVELKGVTCKNREAVLDAFLDDGFLVPT 

FEGIiAALOlEYEENVDLNDVLVPKPFSQFWQPLLRGLHSONFTQ 

A1.LERWLSELPALGISGIRPTYILRWTVELIVANTKTGKNARRF 

SAGOWEARRGWRLFNCSASLDWPRMVESCLGSPCKASPOLLRII 

F\XAMGOGU?DE\EQEXLLR1CSIYTCSGENSLVOEGSEASP1G 

KS PYTLDS LY WSVKPASSSFGSEAKACOOEEQGSVNDVKEEEKE 

EKEVLPDOVEEEEENDDOEEEEEDEDDEDDEEEDRMEVGPFSTG 

OESPTAENARbLAOKRGAI^GSAWQVSSEDVRWDTFPNLGRKPR 

SRPRTPAFXWL,ENYDTHV1FWTKPVL\EQRLEPSTCK\TDTL-GL 

\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLK\GL\KTGLQLF 




2 0? 


3469 


^HOEIEONSAMAPKKRGGRGISFIFCCFRNNDHPEITYRLRNDS 
NFALQTME PALPMP PVE ELDVMFS ELVDELDLTDKHREAM FALP 
AEKKWQ1 V CS K KKDQEENKGATSWPEFY 3DOLNSMAARKSLLAL 
EKEEEEERSKTI ESLKTALRTKPMR r VTR F I DLDGLSCJ LNFLK 
TMDY ETSESR I HTSI>1 GCI KALMNNSQGRAHVLAHSES 1 NVIAQ 
SLSTENI KTKVAVLEI LGAVCI.VPGGHXKVLQAMLHYQKYASER 
'I'RFQTLINDLDKSTGRYRDEVSLKTAIMSFIMAVLSOGAGVESL 
CFRLHLRYE\FLMLGJHPVMDKLRKHENSTLDRHLDFFEMLRNE 
CELE FAKK FELVH I DTKSATQJKFELTRKRLTHSEAYPHFMS I LH 
HCL0MPYKRSGNTVCYWLLLDR31QQ1VIONDKGODFDSTPLEN 
FN I Rmr^RMLVNE^VKQWKEQAEKMRKEHNELOOKLEKKEREC 
DAKTOFKEEMMQTLNKMKEKLEKETTEHKOVKQQVAELTAObHE 
LSRRAVCASIPGGPSPGAPGGPFPSSVPGSLI.PPPPFPPLPGGM 
LPPPPPPLFPGGPPPPPGPPPIjGAIMPPPGAPMGLALKKKSIPQ 
PTNALKSFNWS KLPENKLEGTVWTEI EOTKVFKIl,DLEDLERTF 
SAY0RQ0DFFVNSNSKOKEADAIDDTLSSKLKVKELSV2DGRRA 
0KCNI l.LSRLKLSNDEI KRAI LTMDE DL P KDMbEQLLK F V PE 
KSDIDLLEEHKilELDRKAKADRFLFEMSRINHYQQRbQSbYFKK 
KFAER VAEVXP KVEAIRSGSEErVFRSGALKOLLEWbAFGNYm 
KGORGNAYGFKISSLNKIADTKSSJDKNITLLHYlilTlVENKYP 
SVLNLNE21jRD 1 PQAAK.VNMTELDKE 1 STLRSGLKAVETELEYQ 
X5QPPOPGDKFVSVVSQFITVASFSFSDVEDLLAEAKDLFTXAV 
KHFGEKAGKIOPDEFFGIFDQFLQAVSEAKOENENMRKKKEEEE ' 
RRARMEAOLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDLSKLKPNRKR1TN0MTDSSRERPITKLNF 


6018 


13 


2510 


t i sqsgg i rrr reavwfe wnmdfsr lhm ysppocvpentgyty 
alsssyssdaldfetehkldpvfdsprwsrrslrlattactlgd 
geavgadsgtssavslknraarttkorrstnksafsinwsrov 
tssgvs yggtvs lqdavtrrp pvldes w 1 reottvdhfwglddd 
gdlkggnkaaic^ngdvgagaatghngffcsncnmlsh:rkdvlt 
ahpaapgpvsrvysrdrnqkcddckgkrhldahpgragtlwhiw 
acagyfllq 1 lr rl gavgqavsrtaws alwlawapgkaasg vf 
wwlg igv1 y 0 fvtli s wlnvflltrclrni ckflvlli plfllix3 

plogdseafpwhwmsgveoqvaslsgochhhgenlrebttllqk 
loarvdomeggaagpsasvrdavgqpfretdfmafhoehevrms 

KLEDILGXLREKSEAI0KELEQTKQKT1SAVGEQLLPTVFHLQL 
ELDQLKSELSSWRHVKTGCETVDAVQERVDVOVREMVXLLFSED 
OOGGSLEOLLORFSSQFVSKGDbQTMLRDLQLOILRNVTHHVSV 
TKObPTSEAWSAVSEAGASGITEAOARAIVNSALKbYSODKTG 
MVDFALESGGGS I LSTRCSETYETKTALMSbFGI PLW Y FSQSPR 
WIQPD3 YPGNCWAFKGSQGYLWRLSMMIHPAAFTLEH1 PKTL 
SPTGNISSAPKDFAVYGI.RNEYQEEGOU^FTYTXJDGESLQMF 
0ALKRPDDTAF0IVELR1FSNWGHPEYTCLTRFRW-GEPVK 


6019 


2 


1066 1 


TPNDREP PPQRPPSSRRASHLAQEITSAASLGDQTQI LGSLTTA 



6020 



6021 






549 



495;- 



545 



TVlTSh tr. F MPGJSSQILTNAQGQVJGTLPWWNSASVAAPAPA 
OS LQVQ; .V T ?0 LLLNAQGQV I ATLAS S PLP P P VAV RK\ P STPES 
hhKSBVQ'r 1 KFTPTVPQPAVVIASPAPAAKPSASAPIPITCSET 
F T VS OLVS K ? 'H TPS LDECG I N LE E I R EFAKN FK 3 RR LS LGLTQT 
QVGDAL': ATI:G FAYSQSA3 CRFEKLD1TPKS AQKLKPVLEKWLN 
EAELPNQi C-OCN!,WEfVGGEPSKKRKRRTSFTPOAl EALNAYFE 
KNPLPTGCL I TEIAJCELNyDREWRVWFCRRROTLKNTS KLNVF 

OTP 

"EAIQFlTvT j ?NVGNKrDTTCKPLASTTOYSRAVFDGNVYYYLPW 
AHTKPVVTLrSYHEDISJIRLDAVNTLLAMAERLQTNIEALKSGI 
0GXlPANQL^.r.l,KbKLIDEV3EDTRYTLPl»TF.GKANVTVLD'TOI 
RKLRSRSl i 01 HEAAVRMRSEATDVKSTLAEI EDWLDKLMQLTE 
EPONSKPO: 7 J WM1RGEKRLAYARTPAH0VLYSTSGENASGKYC 
GKTQTl F L XV PQEKN14GPKVPVELRVN3 WLGLSAVEKXFNSFAE 
GTFTVF/JJKVFNCALMFGKKGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWI WKGEW3VDPERSLLTEADAGHTEFTDEVY0NESRYP 
GG DWK PAE I: TDANGDKAAS PS ELTC P PG WEW EDDAVJS Y DINR 
AVDEKGWEYGJ T1PPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 
LT0TAS.VTAG^r.EL0D0EGWEYASLIGWKFHNKORSSDTFRRR 
RWRRKMAPJ-' KTHGAAAI FKJbEGALGADTTEDGDEKSLEKQKHSA 

rrVKGAKTF : vsckfdrdyiyklrcyvyoarnllaldkjdsfsdp 

YAH I CFl.r »: £■ KTTE 1 1 HSTLNPTWDQTI J FDEVEIYGEPQTVLQ 
KPPKVlKFl,rONDQVGKDEFLGRSIFSPWK\,NSEMDlTPKLLW 
^PVKWGr v r^\CGDVI,VTAELILRGKDGSHLPIl»PPQRAPNLYMVP 

qg i r pwc: -'j a i eilawglrnmknfqmas j tspsl.wecggerv 
esvviknikktpnfpssvlfmkvflpkeely^pplvikvidhro 
fgrkpvvgqcti erldrfrcdf yagkedivpqlkasllsappcr 
d ] v i emei r k fllaskcls sms ta1>s kmas patvh lte keeei v 
dwwskfyas: sgeh?:kcgqyickgysklkiyncei*envaefeglt 

DFSDTFKJ Y KGKSDENEDPSWGEFKGSFR J YPLPDDPSVPAPP 
ROFRELPDS VPQFCTVR I YI VRGLELOPODNNGLCDPY I K I TLG 
KKVIEXDRI'^YIPNTLUPVFGRMYEbSCYXPOEKDLKISVYDYD 
TFTRDEKVC >:ri IDLENPF\LSRFG\SHCG\ I PEEYCVSGVNTW 
RDSLR \ PTC M ,LQ*VARFKGFPCP ILSEDGSR I R YGGRDYSLDE 
FEANKILHCKLGA?EERIALHILRTQG^VPEHVETRTI«STFQP 
NISNRYYLRY] 1 WNTKDVILDEKS1TGEEMSDI YVXGW1 PGNEE 
NKCKTDVHY}*? hDGEGNFNWRFVFPFDYLPAEOLCIVAKKJEHFW 
SIDQTEFR7 l r R \LI IQI W\DNDKFS \LDDYLGFPRTLTCRHTI _ 
HFLOKSPGGKC/ RGLDM1 PDLKAWPLKAKTASLFEQKSMKGWW 
PCYAFKDGAk VMAGKVEMTLEI LNEKEADERPAGKGRDEPNMNP 
KLDL PNR P IT'J S I ' L W FTN PC KTM KFI VWRR F KWV 1 1 G LLFLLI LL 
LFVAVLLY S 1-r-KYLSNKI VKPNV 



EAICFEVS } G N V GN K FDTT CX P LASTTQ Y S RAV FDGN Y YY YLP W 
AHTKPWT LT£ Y VI ED I S HRLDAVNTL LAMA E RLQTN I EALKSGI 
CGKI PANCH^.F I A-JLKXIDEV1 EDTRYTLPLTEGKANVTVLDTQI 
RXLRS R SLS Q j K EAAVRKRSEATDVKSTLAE I EDW LDXLMQLTE 
EPQNSMPDI i 1WM1RGEKRLAYARIPAHQVLYSTSGENASGKYC 
GXTQTI FLKV POEXNNGPKVPVELRVNI WLGbSAVEKXFNSFAE 
GTFTVFAEKYHNOALMFGKWGTSGLVGRWKFSDVTGKIKLKREF 
FLPPKGWEKBGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 

ggdwk pae dt y toangdkaas ps eltcp pgweweddaws ydi nr 
AVDEKGWEYGj ti ppdhkpkswvaaekmyhthrrrrlvrkrkkd 

LT0TASSTAGA.w.EEl>QDQEGHEYASLIGViKFHWKORSSDTFRRR 
RWRRKMAPS ETHGAAAI FK LEG A LGADT TEDG DEKS LEKQKHS A 
TTVFSANTf J VS CVFDRDY1 YHL R CYVYQAR NL LATiDKD S FSD P 
YAH I CFLKhS KTTEI IHSTLKPTWDQT1 1 FDEVEIYGEPQTVLQ 
NPPKV I ME L r ?/!^D0VGKDEFI^R S I FSP WKLNSEMD2 TPKXLW 
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPP<?KAPNLYMVP 

qg 1 r pwqlt a i e 1 lawglrnmkw fqmas i tspslwecggerv 
eswiknlkk:?nfpssvlfkkvflpkeelympplvikvidhrq 
fgrkpwg0ct7erldrfrcdpyagxedivp0lkasllsappcr 








)d:viemedtkp llas KCLSSMSTA 1 ,s KMAS patvhlt e k EEE I V 
d w s kf yas s g eh e k cgq y 1 0 k g y s kl k 3 ync e lenv a e 7eg lt 
rf5? dtfklyrgksden£dpsvvgefkgsfriy?lpddpsvpapf 
k0frelpdsvp0ectvri y i vrglel0p0dnnglcdpyi kitlg 
kkvie\dkdhyi pntlnpvfgrkyelscylpoekdlkisvypyd 
tktrdekvget11dlenpf\lsrfg\shcg\1peeycvsgvntw 
rdslr\ ptq\llqnvarfkgfpqp i l5edgsr i r yggrdyslde 
feakkt lhqhu3apeerla1hi lrtqglvpehvetrtlhstfqp 
n 1 s \ r yy lr v 1 1 wkt kdv 1 ldeks 1 tgeems d i yvkgw i pgne e 
n kg ktuvhy r sldg egnfntf rfv fp fdylpaeq ix i vakkehfw 
sidqtefr3 ppr\li iqiw\dkdkfs\lddylgfprtltcrhti 
ukl0kspggnc/rgldmipdlkamnplkaktaslfeck5mkgww 
pcyaekogarvmagkvehtle1lneksajderpagkgrdepnmnp 
kldlpnrpetsfi j wftnpcktkkf:vwrrfkv;viigi,lfllill 
l fvavlrlys lpn y lsmk i vkpkv 


6022 


4953 


545 


EA10FEVS1GNYGNKFLTTCKPJLASTTQY.SRAVFDGNYYYYLPK 
AHT K P WTLTS Y WED3 SHRLDAVNTLLAMAERLQTN 1 EALKSG I 
OCX 3 PANOLAELWLKL3DEV3EDTRYTLPLTEGKASVTVLDT01 
R X LR S R S L S Q 2 H E AAVRMR S E ATDV K S TLAE 1 EDW LDKLMQLTE 
EPONSHPDI I IWMIRGEKRLAYAK3 PAHCVLYSTSGENASGKYC 
GKTO7IKLKYP0EKNNGPKVPVELRVN1WLGLSAVEKXFNSFAE 
GT FT V FAEMY ENQALM FGK WGTSGLVG 3 H K F£ DVTG K I KLKREF 
Fl'PPKGWEKEGEW J VDPERSLLTEADAGHTEFTDEVYQNESRYP 
GC-D W K PAEDTYTDANC-DKAAS PS E LTC P PGWEWEDDAWS YDI NR 
AVDEKGWEYG1TI PPDHKPKSWVAAEKMYHTHRRRRI.VRKRKXD 
bTOTASSTAGAMEELOI>0EGWEYASL:GWKFHWKQRSSDTFRRR 
R WR R KMAP SE THG AAA 3 FKLEGALG7d)TTEDGD E K S LEKQ KH S A 
TTV FGANTP J VSCNFPRDY3 YHLRC YVYOARNLLALOKDS FSDP 
YAH3 CFLHRSKTTE3 IKSTLNPTWDOTI 3 FDEVEI YGEPQTVbQ 
N?FKVIMELFDNDQVGKDEFLGRSIFSPVViaNSEMDITPKLLW 
H PVMNGDKACGDV1>VTAELI LRGKDGSNLPI LP PQRAPNLYM VP 
0G3 RPWQLTA3 EI LAWGLRNMKNFOMAS ITSFSLWECGGER V 
ESWJXNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRO 
KGR K P WGOCT 3 ERLDR FRCDPYAGKEDI VPQLKASLLSAPPCR 
D3VlEMEt)TKPLLASKCLSSMSrALS?J4ASPATVKLTSKEEEIV 
DWWSKFYASSGEHEKCGQY30KGYSKLKIYNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFR1YPLPDDPSVPAPP 
RoFRELPDSVPOECTVRIY3VRGbEljOPOr>tWGbCDPYIKITLG 

kkvie \drdh y 3 pntbn pvfgrm yelsc y lpqekdlk 3 svy d y d 
tftrdekvgeti 3diienpf\lisrfg\shcg\3peeycvsgvntw 
rdslr\ptqVllqnvarfkgfpqpilsedgsriryggrdyslde 

FEANK3LH0HIiGAPEERLALHIi*RT0GLVPEHVETRTLHSTFQP 
N3S\RYYLRVI1WNTKDVILDEKS3TGEEMSD3Y\'KGWIPGNEE 
N KCKTDVH YRSLDGEGNFNWRFVFPFDYLPAEOLC I VAKKEH FW 
SlD0TEFRIPPR\LIIOIW\DNDKFS\I>DDYLGFPRTLrCRHT3 
HFLQKS PGGNC/RGLDMI PDLKAMNPLKA KTAS LFEQKSMKGWW 
PCYAEKIX3AR\^AGKVEmLEILNEKEAI)ERPAGKGRDEPNMNP 
KLDLPKRPETSFLWFTNPCKTMXF3VWRRFXVJVI3GLLFLL1LL 
LFVAVLLYSLPNYLSMK3VKPNV 


6023 


302 




"SOELGMFV^I^LLNTTPDRAEOGKLTLLCDAKTDGSFLVHHFL 

legl/ivcsgrWfoaqkepkplqflreanagnlkplfefvrea 
lk pvdsgearwtypvllvddls viaslgmgavavlidf i hycrat 
vcwelkgnmvvlvhdsgdaedeendillnglshoshlilraegl 
atgfcrdvmgqlr i lwrrpsqpavhrdqsftyqyki qdksvsff 
akgmspav3j 


6024 


3 


3 260 


FLS FLCY PRFRCLFCLQFAI PASRMEQLNELELLMEKS FWEEAE 

lpaelfx?kkwasfprtvlstgndnrylvlavntvqnkegncek 
rlv3 tasqslbnkelci brndwcs vpvepgdi 3hlegdctsdtw 
1 1 djofgyl3lypdmli sgtsi ass 3 ro^rravl.setfrssdpa 

TRQtfLl GTVIjHE VFQKAI NNSFAPEKLOEIiAFQTI OE3 RHLKEM 








YRLNLSQDEI KQEVEDYLPS FC KW AGD F KK KNT5TD F P QMC LS L 
PS DKSKDNSTCN I E WXPtfD I EES I WS PR FGLKUK J DVTVGVKI 
HRGYKTKYKlMPbELKTGKESNSI EHRSQWLYTLbSQERRADP 
EAGLbLYLKTGQMY PV?ANHLDKRELbXbRNQttAFSLFHRl SKS 
ATROKTQIASLPQT lEEEKTCKYCSQIGNCAIiYSRAVEOOMDCS 
SVPIVMLPKI2EET0HLK0THLEYFSLWCLMLTLESQSKDNKKN 
»ONIWl^PASEMEKSGSCIGNl»}RMEHV;<IVCDGOYliHNFOCKH 
GA2PVTNLMAGDRV1VSGEERSLFALSRGYVXE1NMTTVTCLLD 
RNLS VbPESTLFRLDQEEKNCD J DTPLGNLS KLMENTFVS KKLR 
Db 1 1 DFREPQFI SYbSSVLPHDAKDTVAC I LXGLNKPQRQAMKK 
VbLSKDYTLI VGMFGTGKTTT2 CTLVRI LYACGFSVLLTSYTHS 
AVDW I bLKlJiXFKlGFLRSRXQIQKVHPAlQQFTEHEl CRSKSI 

k s \ lalle elytscl1 dattcmg 1 nhpi fsk k i fdfci vdeasq 
isqpicbgplffsrrfvlvgdkqqlpplvlnrearalgmseslf 
krleqnksawqltvqyrmnskikslsnkltyegklecgsdkva 
nav" nlrhfkdvklelefyady sdhpwlmgv f e pnnp vc flntd 
kvpapeqvekggvsnvteakiilvfltsifvkagcspsdjgiiap 
yroolkiikdllarsigmvevntvdkyouNrdksivlvsfvrsk 
kdgtvgeliikdwrrl ,n\'aitrajoikli llgcvpslncypplekl 

LNHI,NSEK1>I I DLPSREHESLCH 1 LGDFCR* 


6025 


397? 


89 


GGFPAOSDHLPPVFPbRSDLLITMSTbWSI-HPDAFPSLRALIA 
/iRYGEACEGPGWGGAHPRICLCPPPTSRTSFPI'PRLPALEOGPG 
GLWYW G ATA VAQLLW PAG LOG P G G SRAAVbVOQWV S Y ADT EUI P 
AACGATLPALGbRSSAQDPOAVLGALGRALSPLEEWLRLHTYLA 
G E A P 7 LADLAA V TALLLP FR Y V L D P PAR R J WNNVTR W FVT CVRQ 
PEFRAV^GEWbYSGARPLSHOPGPEAPALi'KTAAOLKKEAKKR 
ZKLEKFOOKOKIOO^QPPPGEKKPKPEKREK^DPGVITYDLPTP 
PGEKKDVSGFMPDSYSPRYVEAAWYPWWEOCGFFKPEYGRPNVS 
AANPRGVFMMCIPpPNVTGSLHbGHALTNAlODSbTRWHRMRGE 
T IT. WN PG CDHAG 1 ATC* VWE XKLWR EQGLSRKQ bG R E AFLQ E VW 
KWKEEKGDR I YHQLKXLGSSbDWDRACFTMDP KLSAAVTEAFVR 
I.HEEG 1 1 YRSTRbVN WSCTbNS A 1 SD1 EVDKKELTGRTLbS V?G 
YKEKVEFGVTjVSFAYKVOGSDSDEEVWATTR J ETMI>GDVAVAV 
HPKTjTRYOHbKGKNVlHPFLSRSbPlVFDEFVDMDFGTGAVKIT 
rAHDQNDYEVGQRHGLEAISI mdskgalinvf ppflglprfear 
XAVLVALKERGLFRG I EDNPM WPbCNRSKDVVEPbLRPOWYVR 
rGEMAOAASAAVTRGDLRIIiPERH0RTWHAWKDNlRE\WCMFPG 
XLWWCA HR\ 1 PAY FVTVSDPAVPPGEDPDGRYWVSGRNEAEARE 
KAAKEFGVS PDKI SLQQDEDVbDTWFSSGbFPLS 1 LGWPNOSED 
LS VFY PGTLLETGHD I LFFW VAR MVMLGLKLTGR LPFREVYLHA 
IVRnAKGRKMSKSbGNVIDPLDVlYGISLOGUJNObbNSNLDPS 
EVEKAKEGQKADFPAGIPECGTDALRPGLCAYMSQGRDINLDVN 
R I E.GYRH FCNKbWNATKFALRGbGKGF VPS PTSOPGGHESLVPR 
WIRSRLTEAVRLSNC>GFQAYDF?AVTTAOYSFWLyELCDVYLEC 
LKPVLNG VDQVAAE CARtfTLY TCLDVGbRbLS PFMPFVTEELFQ 
RbPRRy.PQAPPSLCVTPYPBPSECSWKDPEAEAALELALSITRA 
VRP\LRAX)YKbHPESGPTCFLEVAD\EATGALASAVSGYVOGPG 
OAQWVA VAS PWGLPAP \0GCAVALASDR CS 1 \ HLOLQG\LbDP 
AREbG\Kb0\AKRVEAO\R0AO\RbR\ERRA\.ASGNPVKVPb\E 
VQEADEAJUiQQTEAEbRKVDEA I ALFQKML 


6026 


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVCAVDKKVDCPRLC 
TCEI RPWFTPRS I YMEASTVDCNDLGLLTFPARbPANTQIbLLQ 
TNNIAKI EYSTDFPVWbTGLDbSONNLSSVTNI NGKKMPQbLSV 
Y LEENKLTELPEKCLS ELSNbQE LY I NKNLLST I S PGAFI GbHN 
LLRLHLN S NR LOM 1 NS K WFDAL PNLE I LM I G EN P 1 1 R 2 KDMNFK 
PLINLRSbVLAGI NLTEX PDNALVGLENLES 1 S FYDNRLI KVPH 
VALOKVVNLKFLOLHKNP1NR 1 RRGDFSNMbHLKELGINNMPEL 
ISIDSLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKLESLML 
NSNAXSAbYHGTIESLPNLKEIS J HSNPIRCDCVIRWMNMNKTN 
I RFMEPDSLFCVDPPEFQGC^rVROVHFRDMMEI CbPLIAPESFP 
S NbNVE AGSYVSFHCRATA\E PQPEIYW1T PSGOKbbPNT\ LTD 






6027 



6028 



6029 



120 






4j4 t 



343; 



3 53? 






KJFYVHSEGTLDJ NGVTPKFGGLYTC1 ATNLVGAPLKSVM3 KVDG~ 
SFP0DNNGSLKJK7RD1OANSVLVSWKASSKILKSSVKWTAFVK 
TENS KAAQSAK ■ F $ DVKVYN LTKLN PSTE YKI C 1 D I PTIYQKNR 
:<KCVWTTK6L?:?D0KEYEKNNTTTLMACLGGLLGI IGV3 CLIS 
CLSPEMMCDGGHF Y VRNY1,QKPTFALGEL,YPPL1NLWEACXEXS 
TS LKV KATV IGLF7NMS 



GGRRAPGRPGPi ] KDEEEETVFREWSFSPDPLPVRYYDKDTTK 
PISFYLSSLEEJ^i/tW'KPKLEPGFWVALEPLACRQPPLSSGRPRr 
LLCHCMMGGYL.I DK FJQGS VVQTPYAKYHWQCI DVFVY FSHHTV 
TIPPVGWTOTAriRHGVCVLGTFITP.WhfEGGRLCEAFLAGDERSY 
GAVADRLVQI T\ R F FR FDG W h I N I ENS LSLAAVGNMPP FLR YLT 
TQLHRQV PGGLVIWYDS\A/0SG0LKWODELNQHKRV FFDSCDGF 
FTN YNVCREEHLE KKLG0AG ERRADVYVGVDVFARGNWGGR FDT 
DKVGGGFRPRAiX-PVPPUJPHFl.MDI.PFPSAPORNDSSCSSOSG 

dpvalrnrcpa; a>:l.cph 



NCLIibOAKGFHC-i-':EDUWWLTDlERiiLi^ASKPU5GLPETAKEO 
LNVHMEVCAAFEAK EET Y KSLMOKGQQMLAR CPKS AETNI DQDI 
KNLKEKWESVETKI.NRRVKT\KLEEALNIA\MEFHNSL\QDFIN 
WLTOAEQTLNVASkFSLI LDTVLFQIDEHKVFANEVNSHREQI 1 
ELDKTGTHLKY FP0KQDWL1 KNU>I SVQSRWEKWQRLVERGR 
SLDDAR KRAKQ F HE AW S X LMEW LE L'S E K S LDS ELE 3 ANDPDK 1 X 
TQLAOHKEFOKCl^AWlSVYD'rTJ^RTGRSIjXEKTSIiADDNLKLD 
DMLSZLRDKWDT • CG KS VERON K LEEA\ L1»FSGOFTDALOAL1 D 
WbYRVEPQLAEDCP'^lGDl DLVMNLI DNHKAFQKELGKRTSSVQ 
ALKRSARELIEGFRDDSSWVKVO^GEUSTRWETVCALSISKOTR 
LEAA^RQAEE FH 5 WHALI ,EW LA EAEQTLR FHG VLFDDEDALRT 
LID0HKEFMKK1 EEKRA£IjNKATTMGDTVLAICHPDS3TTI KKW 
ITI IRARFEEVI^AWAXQHQQRL/vSALAGLIAXQEIjLEALLAWLO 
KAE?TI>TDKDKE V 1 PQEI EEVKALI AEHGTFMEEMTRKQPDVDK 
VTKTYKRRAADPF SLOSH 1 PVLDKGRAGRKRFPASSLYPSGSQT 
OIETXNPRVHLIA'SKWOOVWLLAI.ER^RXLNDALDRLEELREFA 
NFDFDIWRKKYMRWKN-HXKSRVHDFFRRIDKDQDGXITRQEFID 
GI1>$SKFPTSRLEKSAVADIFCRDGDGY3DYYEFVAA1*HPNKDA 
YKP 1 7DADK I EDE VTRQV A KCKCAKR FQVKQ I GDNKYR FFLGNQ 
FGDS0QLRUVRIL5STVMVRVGGGKMALDEFLVKNDPCRAXGRT 
NKBI.REKFILADGASQGMAAFRPRGRRSRFSSRGASPNRSTSVS 
SCAAQAASPQVPATTTPXILHFLTRNYGKPWLTNSKMSTPCXAA 
ECSDFPVPSAEGTPICGSKLRLPGYLSGKGFHSGEDSGLITTAA 
ARVRTQ FADSKXT PSRFGS KAGS KAG SRASSRRGSDASDFD I SE 
JOSVCSDVETVPQTHRPTPRAGSRPSTAKPSXIPTPQRKSPASK 
LDKSSKR 



" j MPCGSSRLIiRCCWTH PKE PVSDLS YFDCl ES VMENSKVLGESM 
AGIS0NAXTGDL1A FG ECVG I AS KAL CG LTEAAAQAAYLVG 1 FD 
PNSQAGHQGLVDP J QFARAHQAI QKACQNLVDPGSS PSQVLS AA 
T I VAXHTS ALCN A CR I AS S KTANPVAKRH F VQS AKE VANSTANL 
VKTIKAljIX>DFSFD>mNKCRIATAPLIEAVENbTAFASNPEPVS 
IPAQ3SSEGSOAOEPIL>VSAKPMLESSSYI>IRTARS1jAINPKDP 
PTWSVLAGHSHTY SDS IKSliITSI R DXAPGQRECDYS I DG3NRC 
IRDIEQASLAAVSOSLiATRDDl SVEALQE0LTSWQEIGHL1DP 
I ATAARG EAAQLG H KGTQLAS YFEPL 1 LAAVGVAS XI LDHQQQM 
TVLDOTKTLAE6AL0MLYAAXEGGGN P KAOHTHDAI TEAAQLMK 
EAVDDI MVTLNEAAS EVGLVGGMVDAI AEAMSKLDEGTPPEPKG 
7FVDYCTTWKYS KA1 AVTAQEMMTKS VTNPEELGGLASOMTSD 
YGHLAFQGCMAAATAEPEEI GFQI RTRVQDLGHGC1 FLVQKAG\ 
AL0VCPTDSYTKREL1 ECARAVTEKVSLVLSALQAGNKGTQACI 
TAATAVSG1 1 ADLDTT 1 MFATAGTLN AENSETFADHRENI LKTA 
KALVEDTKLLVSGAASTPDKLAOAAOSSAATITQLAEVVXJLGAA 
SLGSDDPETQWL j NA I KPVAXALSDLISATKGAASKPVDDPSM 
YOLKGMKVMVTNn'TSLLKTVKAVEDEATRGTRALEATIECIKQ 
ELTVPQSKDVPEKTSS PEES 3 RMTKG I TMATAKAVAAGNS CROE 
DVIATANI^RJCAVSDMLTACKQASFHPDVSDEVKTRALRFGTEC 








tlgy ldlbeh vxv i lqx p tpe l kqq i »aa fs kr vag a v t eli 0 aa 
eamkctewvdpedptvi aetellga.aas 1 eaaakkleqxkprak 
pxoadetldfeeqjleaaksiaaatsalvksasmorelvaqgk 
vgs i panaaddg0ws0gi. i saarmvaaatsslceaanasvqgha 
seekl1s5axqvaastacllvackvkadqdseamrrlqaagnav 
krasdnlvraaok a a fg kaddddvw ktkfvgg i ao 1 i aaqe em 
lkkkreleearkklaq: rqqqykflptelredeg 


6030 


3 


I'm 


fpgrgspajlqlevl 3 clglmglerajbnvl api fyrn1 vnllten 
apwnslawtvtsyvfi,kflqgggtgstgfvsnlrtflvjirvqof 

TSRRVELLI FSHLH ELS LR WHLGRRTGE VLR I ADRGTSS VTGLL 
SYLVFNVI PTLADI 1 3GI I YFSMFFNAWFGLI VKLCMSLYLTLT 
IVVTEWRTKFRRA^TCENATRAJ^VDSLl^FETVKyYNAESYE 
VEKYREAI 1 KYQGL EWKSSASLVLUJQTQNLVIGLGLLAGSLLC 

ay fvteoklqvgdy vlfgt yi iqlykplnw fgty yrm 1qtnfi d 
menmfdllkAetevkdlpgagpfrfokgriefenvhfsyadgr 
etlqdvsftvmpgqtlalvgpsgagkst3 lrllfrfydi ssgc1 
ridgodisqvtoalfrfshmelcpkdtvlfndtiadnirygrvt 
agndeveaaaqaagihda1mafpegyrt0vgerglklsggekor 
vaiartilkapgiilldeatsaldtsneraioaslakvcanrtt 
iwahrlstwnadoilvikdgcivergrheallsrggvyadmw 
glqqgqbetsedtk pqtker 


6031 


160 


1694


LRMSENLDKSNVNEAGKSKSNCSEEGLEDAVEGADEALOKAIKS 
DSS S PQRVOR PHSSPPRF VTV^ELLETARGVTWIALAK E I WNG 
DFOIKPVEliPENSLKKRVKEIV}?KAFWDCLSVQLSEDPPAYDHA 

iklvgeiketllsfllpghtrlrnq: tevldldli koeaengal 
d2sklaef1 jgmmgtlcapardeevkklkdikeivplfre1 ksv 

LDLMKVDM^UNFAlSSIRFHLMQOSVEYERKKFQEILKROPNSIiD 
FVTOWLEEASEDLMTQKY KHAI »PVGG WAAGSGDMPRLS PVAVQN 
YAY LKLLKKDHI ,QR PFP ETVLMDQS RFHE LQLQ\ R EOLT3 LGAV 
LLVTFSMAAPGI SSQADFAEKLKM2 VKI LLTPMHLPSFHLKDVL 
TT2GEKVCLEVSSCLSLCGSSPFTTDKETVLKGQI0AVASPDDP 
IRRJMESRILTFLETYLASGHOKPLPTVPGGLSPVORELEEVAI 
Kf ARLVNYH KHV FC PYYDA7 LSK ILVRS 


6032 


39 


2415 


AARLCRAOPTKSAWMIRPLSKMYPQTRHPAPHQPAOPFKFTISE 
SCDRI KFEFQPLQAOYRS LKLECEKLAS E KTFMQRH Y VMY YEMS 
YGLNJ E>IHKOAEIV:<RJJ^A1CAQVIPFLSQFHOOOVVOAVERAK 
QVTMAELNAIIGO0OLCA0HLSHGHGLPVPLTPHPSGLQPPAIP 
PIGSSAGLI^SSALGGpSIJLPlKDEKKHHDNDHQRDRDS I KSS 
SVSPSASFRGAEKHRNSADYSSESKKOKTEEKEIAARYDSDGEK 
SDDNIiWDVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPlSP 
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPKGIVPHAG 
MNGELTS PGAAYAGLHN 1 S POMS AAAAAAAAAAAYGR S PWGFD 
PHHKM R V PA I PPNLTG I PGG KPAYS FHVS ADGC/MQ P V P FPP DAL 
IGPGI PRHARQ3 NTLNHGEWCAVTISNPTRHVYTGGKGCVKVW 
DISHPGNKSPVSC?LDCLKRDNYIRSCRLLPE>GRTL1VGGEASTL 
S1WDLAAPTPR3 KAELTS S AP ACY ALA 1 SPDSKVCFSCCSDGNI 
AVWDLHN0TLVRQFOGHTDGASCI DI SNDGTKLWTGGLDNTVRS 

wNdlregrolqohd/fftsfvfslgycpXteewlavgmensnW 
evlhvtkpdkyqlhlhes cvlslkp'ahcgk wf\ vs tg kdnllna 

W\RTPYG\A5IF\0SKESSS\VT>SCDI\SVDDKYIV7GS\GDK\ 
RATVYEVIY 


6033 


39 


2415 


AARLCRAOPTKSAHM IRDLSKMYPQTRHPAPHQPAQPFKFTI SE 
SCDR I KEEFQFL CAQ YHSLKLECEKLASEKTEMQR HYVMY YEMS 
YGLNI ENHXQAE I VKR LNAI CAQVI P FLSQEHQQQWQhVERAK 
OVTMAELRAI I GQQOLOAOKLSHGHG LP V PLT PHP SGLQ? PAI P 
PIGSSAGLLALSSALGG0SHLPIKDEKKHJ1DNDHQRDRDSIKSS 
SVS PSAS FRGAEKKRWSADYSS ES KKQKTEEKEIAARY DS DGEK 
S DDNLWDVSNED F S S PRGSPAH S PR ETvGLDKTRLLK KDAP I S P 
ASIASSSSTPSSKSKEl^LNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPG VD PLAS S LRTPMAVPCPYPTPFG I V PHAG 








mnjgelts pgaa yaglhni s pc/ms aaaaaaaaaaaygrsp w~g fd 
pj^inimrvpaippnltgipggkpaysfhvsadgomopvpfppdal 
3 g pg 1 p rharq 1 ktbnhge wcavt i snptrhvy tggkgcv k vw 
dishpgnkspvs0ldclnrdnyirscrllpd3rtlivggeastl 
s i whlaaptpr 1xae1tssapacyala1 spdskvcfsccsdgni 
a v wdlhnqtlvrqfoghtdgas c j dl sndgtklwtggldntvr s 
k\dlregrqlcx)hd/fftspvfslgycp\teewlavgmensn\v 
evlhvtkpdkyolhlhescvlslkfahcgkwfNvstgkdnllna 

W\RTPYG\ASIF\0SK2SSS\VbSCD3\SVDDKYlVTGS\GDK\ 
RATVYEVJY 


6034 


2G83 


714 


FS G RR R R bKRRKS PCPGTAGG PG ETN PG PGACPRG PREEAAAAM 
EI APOEAPPVPGADGDI EEAPAEAGSPSPASPPADGRLKAAAKR 
VTFPSDED1 VSGAVEPKDPMRUAONVTVDEV 1 GAYKQACQKbNC 
K Q 1 PXL LRQLOE FTDLGHRLDCLDLXG EKLDY KTCEALE EVFKJR 
LOFKWDLEOTNLDEDGASALFDMIEYYESATilLNl SFNKH1 G7 
RG WQ AAAHMMR KTS CLQY LADAR NTPbLDH SAPFV ARAL3 I R S S 
l^.VLHLENASLSGRPUIbLATALKMNKNbRE^YLNADNKLNGbO 
PSAObGNLLKFNCSLQIl.DbRKNHVLCSGLAYlCEGLXEORKGL 
VTb\VI,maOLTHTGI^rLGKTbPHTGSLETLNl/3HKPlGNEGV 
RKLKNGb 3 SNR S VLR LGLASTKLTCEGA VAVAEF 3 A ES PRXLR L 
DLR F.NE 1 XTGG LWALS LALKVNH S LLR LD LDRE P KKEAV KS F 3 E 
T0K ALLAE 1 QNGCKRN LVLARER EEXEQP PQLS ASM PETTATEP 
OPDDEPAAGVONGAPSPAPSPDSDSDGDSDGEEEEEEEGEREET 
P$GA1DTRDTGSSEPQPPPEPPRSGPPLPNGLKPEFA1*AL>PPEP 
PPGPEVKGGSCGLEHELSCSKNEKELEEbl.bEASOESGQETL 


6035 




404 


SVTYLG1 3 LHKNTGALPADPVQblSQTPTtSTK^bLSFLGMVG 
Y F Y LW I PG FAI bTKPI»CKbTKENbADA I DP KSFS HS S FRS LKTA 
L£NASTLALPUSSQPF\ S LHTA E VQGCWE 1 LTQGLGP.LPV 


6036 


1741


3S6 


LPDVEKliGRRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKbQRN 
SKGGQGRGVEKPPHLAALI LARGGSKGI PliKNI XHLAGVPLIGW 
Vl.RAALDSGAFOSVWVSTDHDEIENVAKQFGAQVHRRSSEVSKD 
SSTSLDAI 1 EFI^NYHNEVDlVGNIQATSPCUiPTDLQKVAEMIR 
EEGYDSVFSWRRHQFRWSEIQXGVREVTEPliJMLNPAKRPRROD 
WDG ELY ENGS FY FAKRHL I EMG YLQGGXMAY YEMRAEHS VEUDV 
C3DWPIAE0RV1.RYGYFGKEKLKEIKLLVCNIDGCLTNGH1YVS 
GDQKEIISYDVK1DAJGISLLKKSGIEVRLISERACSK0TLSSLK 
LDC KMEVS VS DXbAWDEWR KKMGbCWKEVAYI.GN EVSDEE CLK 
RVG bSGAPADACSTAQKAVGYlCKCNGGRGA\ 1 REFAEHI C\LL 
MEKGLI^FNPKNRNLAVNIGEKK 


6037 


2936 


1919


WrxWWMCSVLTZbbFSbOGNKMLNYSAPSAGGybLPRKPVGTPA 
GC-GFPRRHSVTLPSSKFRONQLLSSLKGEPAPALSSRDSRFRDR 
SFSEGGERLLPTQXQPGGGQVNSSRYKT\ELCRPFEE*JGACKYG 
DKCQFAHGI HELRSLTRHPKYKTELCRTFHT1 GFCPYGPRCHFI 
HT^AEERJcALAGARDLSADRPRLQHSFSFAGFPSAAATAAATGLL 
DS PTS I TP PP I LS ADDLLGSPTLPDGTNNPF\AFSSOELAS LFA 
PSMGLPGGGSPTTFLFRPMSESPHMFDSPPSPODSLSIXJEGYLS 
SSS SSHSGSDS PTLDNSRRLP I FSRLS I SDD 


6038 


1450


426 


SSALQEr GTRNHTFGVPLPHRRKQI 1 SCNICQLRFNSDSQAAAH 
YKGT I01AKKLKALEAWKNKQKSVTAIOSAKTTFTS I TTNTI NTS 
SEKTiXTAGTPA I STTTTVEI RXSS VMrTEI TS KVEKSPTTATG 
NSSCPSTEreEBKAKRLL\YCSbCKVAVNSASOLEAHNSGTXHK 
TMI.FARNGSGT J KAFPRAG VKGXGPVNKGNTGLQNKTFHCE I CD 
VHVNSETOLKOHlSSRRHKDRAAGKPPKPXYSPYrlKbOKTAHPL 
GVKbVFSKEPSKPI^PRILPNPIAAAAAAAAVAVSSPFSLRTAP 
AATLFOTSALPPALLR PAPGP1RTAHTPVLFAPY 


6039 


4073 


1000 


LDEyEARLTU^XDDFEEDNEDDDENRVNOEEKAAXI TELI NXL 
NFLDEAEKDLATVNSNPFDDPDAAELNPFGDPDSEEPITETASP 
RKTEDSFYK^SWPFKEVOTP0rX»NPFDEPEAFVTlKI)SPP0ST 
KXKNIRPVDMSKYLYADSSKTEEEELDESNPFYEPKSTPPPNNL 
VNP VOELETERRVKR XAPAPP VLS PKTGVLNENTVS AGKDLSTS 








pkfspipspvlgrkpnasqsllvwckevtxnyrgvxi'j-nfttsw 
kngi>s fc a 3lv:kfr pdl.i dy kslnpqd 3 k enn k xay dg fas 1 g 1 
5 rlueps dm v l^ai pdkltvm jt lyq i rah fsgceuvwq 1 een 
sskstykvgnyfj-dtnssvdoekfyaelsdlkrfpeloqpisga 
vdflsoddsvfvndsgvgesesehqtpddhlspstaspycrrtk 
sdtepqksqqssgr7sgsddfg3csntd5tqaqvllgkxrllxa 
et1.elsdlyvf/jkkkdmsppf1cf.etde0klctld1gsnlekek 
lensrslecrsdpespikxtsl.sptsklgysysrdldlakxkha 
6lrotesdpdadrttbnhadhsskiv0hrllisr0eelkerarvl 
1 ,etf ar3caalxagnkwtntaa pfcnrcls dqqdeer rrqlrer 
arquiafj^rsggkmselpsygeraaeklkerskasgdendniei 
etnefjpegfvvgggdeltnlendldtpeqwsxlvdi.klkklle 
vqpqva^spssaaqxavtesseqdmxsgtedlrterlqxtterf 
rn pwfskpstvr ktqlqsfsqy 1 enrpehxrqrs1 oedtkkgn 
eekaairetork?sedevlnkgfkds\sqyvvgelaaa»eneqkq 

I DTRAALVEKJRLRY1jMDTGRNTEEEEAMM0EWPT4LVNKKNALIR 
R^NOLSLjI jEKEHDLERRYELUNRELRA v U.J\3 EDWQKTEAQKRRE 
QLLLDE1A7ALV70 KJ?DALVKD1 j;AOEXQAEEEDEHLERTLEQNXG 
KMAK.KEE KCV LC 


6040 


475 


J052 


PTALMT A PSCAT P VQFRQPS VSGLSQ I T KS LY 3 SNG VAANNKLM 
LSCNQITMV2NVSVEWNTLYEDIQYM0VPVADSPK T SKLCDFFD 

? I ADH I H s vemkogr\tllh C AAGVSR s aalclay lmkyhamsl 

LDAKTWTKSCRF3 I RPNSGFWEQLIHYEFOLFGKNTVHKVSSPV 
GMIPDIYEKEVKIjMI PL 


6041 


2 


^88 6 


te kde ktahn le n v l j h fwer ls e i cva k 3 se p ead vds vlg vs 
kllcvlokpkgslksskkkngkvrfadeilesnkehekcvsssg 
ekiecwel.ttepslthnssgllsplrkkpledlvcklad1siny 
vnerkseqhlrflstlldsfsfsrvfk^lgcekqsivqakple 
iaklvqknpavoflyckxigwlnedqrxdfgflvpilysalrcc 

DNW4ERKKVI,DDLTKVDLKVmSLbKl 3 EKACPSSDKHAiVTPWL 
KGDI I^EKLVm^CLCTCEDLESRVSSESHFSERWTLJ .SLVLSQ 
HVKNDYLI GDVTVER2I VRLHETLFKTKK3iSEAESSD£SVSFIC 
DVAYNYFSSAKGCLbMPSSEDLLLTLFC-LCAQSXEXTHLPDFLl 
CKLKNTWIkSGVNLLVHQTDSSYKESTPLHLSALWLKNOVOASSL 
01 NSLQVLLSAVDDLLNTLLES EDSYLMCVY 3 GS VKPNDSEWEX 
MR0SbPMQWLHRPLI J EGRLSI J NYE;CFKTDFKEOD3XTLPSKI»CT 
S ALLSXMVL3 ALR K ETVLENNELSKI I AELLYSLOWCERLDN PP 
IFLIGFCE 3 LQKKK 1 TYDNLRVLGNMSGLLQLLFNRSRSHGTLW 
SL3 IAKLI LSRS3 SSDEVKPHYXRKESFFPLTEGNLHTIOSLCP 
Fl.S KEEKK EFSAQC3 PALLGWTK KDLCSTNGGFGHLA I FNSCXQ 
TKS 3 DDGELLHG3 l.X 3 II SWXKEHED1 FLFSCNLSEAS PEVLGV 
Nl EI IRFLSL?bXYCSSPLAESEWDFIMCSMLAWLETTsENOAL 
YS 1 PLVQL FAC VS CDLACDLS AFFDS TTLDT 3 GNLPVNL3 SEWX 
EFFS0G3KSLLLP I LVTVTGENXDVSETSFONAMLXPMCETLTY 
3SKEQLLSHXLPARLVADQKTNLPEYLCT3j1jNTLAPIjL»LFRARP 
VQ 3 A V YHM L YKLM PE LPQYDQDKLXS YGDE EEEPALSP PAALMS 
LLS 1 0EDLLENVLGC 3 PVGQIVT 2 X PLSEDFC Y VLG YLLTWKLI 
LTFFXAASSOLRALYSMYLRXTKSLNKLLyHLFRLMPENPTYAB 
TAVEVPNKDP KTF PTEELQ1>S 3 RETTMLPYH I PHLACS V Y HMTL 
XDLPAMVRLWWN?SEXRVFNIVDRFTSKYVSSVLSF0E3SSVQT 
STOLFNGKTVXAR-ATTREVMAT Y T3 EDI V3 ELI 1 0LPSNY PLGS 
1 3 VESGXRVGVAV OOWRNWML0LSTYLTHQNGS 3 MEGLALWKNN 
VDKRFEGVEDCM1CFSVIHGFNYSLPKXACRTCXKXFKSA\CLY 
XWFTSSNKSTCSLCRETFF 


6042 


1306 


253 


MAELAPASPSDIXASVSNGDTTLLCSRROSCGMNEVROVSLTYP 
GSFAPSHSLPLQPRSGGSLCPSRAW/PPPHQLFDDTSSAQSRGY 
GAQRAPGGLSYPi4ASPTP3^AAPlJa)PVSlW>lAyGSS1^0^KE 
LVEKNlDRFIP17KLKYYFAVI>TKYVGRKJLfGLLFFPYLH0DWEIV 
QYCODTPVAPRFPVNAPDLYIPAMAFITYVLVAGLAIXSTQDRFS 
PDLLGLQAS SALAWLTLEVLA1 LLS LYLVTVNTDLTTI DLVAFL 
GyKYVGMJGGVLKGLLFGKIGYYLVLGWCCVAlFVFMIRTLRLK 








1 LADAAAEGV PVRGARNOLRMYLTMAVAAAOPMLMY WLTFHLV R 


6043 


401


$99 


LCLFFF F PCATPVL PLPS L1SA1 ,/eLSHLSVSSUTCPCQPPLPC~ 
PLPPLGNKTAKGSLSTEQSERG 


6044 




412 


KXEMU^FTUSKVKISREVTMIASKPGIGQOVRHSLLGYLJGVVV 
DIDPVYSLSEPSPDEIAVNDELRAAPWYHVVMEDDNGLPVHTYL 
AEAQLSS ELQDEHP \EQPSMDELAOTIR KQLQAPRLRN 


6045 


155 


2299 


SPLPQVAAMNYLRRRLSDSKFMANLPKGYMTDLORPOPPPPPPG 
AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFSSL 
SNAVKQTTAAAAATFSEOVGGGSGGAGRGGAASRVLLVIDEPHT 
DWAK YFXGKK I HGB2D1 K VEQA E FS DLNLVAHANGG FS V DME Vl» 
RNGVKWRSLKPDFVL1RQHAFSI4ARNGDYRSLV3GLQYAGIPS 
VNSLHSVYNFCDKPWVFAQNSVRLHKKIjGTEEKPL.IDQTFYPNHK 
BMLS S\TTYP VWKMGHGTLWGWC- KVWDNOHDFQDI AS VV ALT 
KTYATAEPFIDAKYDVKVOKIGOKYKAYMRTSVSGNWKTNTGSA 
MLE0 1 AMSDR YKLWVDTCSE I FGGLI") I CAVEALHGKDGRDHI I E 
WGS SMPL 1GDHQDEDK 0L I VELV'.r^XMAOALPROROR DAS PGR 
GS HGOTP S PG ALPLGRQTSQQ P AGP P AQORPP PQGGP PQ PG PG P 
QR QG P PLQQR PP POG00HLSGLGP P ACS PLPQRLPS PTS APQQ P 
AS QAA PPTQGQGRQSRPVAGG PGAF PAAR P PAS PS PQRQAGPPQ 
ATUQTSVSGPAPPKASGAPPGGQQROGPP0KPPGPAGPTRQASQ 
AG PV P? TG P P TTQ0 PRPSG PG PAGR ? KPOLAOKPSOD VP P PATA 
AAGGPPHPQLNKSOSLTNAFNLPEPAPFRPSLSODEVKAETIRS 
IRKSFASLFSD 


6046 


21i" 


1075 


FGLTGPCERVPFLLGRGPPiJGATRAGHRRAVRWAGPESLPPLPR 
SLlMDSPRAGTHQGPLDAETEVG/iDRCTSTAYOEQRPQVEQVGK 
QAP1>S?GLPAMGGPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCA 
FTVALRARRGADLSSLRALLG0ALFHO\A0LGOLSYLAPGEDGH 
WVPI PE EES LQRAWQDAAACPRG LQLQCRGAGGR P VL YQ WAQH 
S Y S AQG P EDLG F RQGDT V D V LCEV D C AW LEG HCDGR IG1FPKCF 
W PAG PRMSGAPGRLPR SGQGXIQP 


6047 


49 


1405 


PVLVTSLRMREADtLRPPOLMEVSADIISTVEFNHTGELLATGD 
KGGRWIFOREPESKNAPKSQGEYDVYSTFQSHEPEFDYLKSLE 
IEEKINKI KWLPQQNAAHS LLSTNDKTI KLWK 1 TERDKR PEGYN 
LKDEEG KLKDLSTVTSLQV P VLKPKD LWVEVS P RR I FAKGHTYH 
1NS1SVNSDCETYHSADD1,RINLWHLA1TDRSFTP\NIVDIKPA 
NMEDLTEVITASEFHPHHCNLFVYSS'SKGSLRLCDMRAAALCDK 
HSKI>FEEPEDPSNRSFFSEIIS\SV?DVKFSHSDRYM1,TR\DYL 
TVIO^DLVNMEARPIETYOVHDYLRSKLCSLYENDCIFDKFECA 
WNG S DS V I MTG A \ YNN FFR M FDRNT K RD VTL \ EASRESSKP RA V 
LKFRRVCVGGKRRRDDISVDSLDFTFJCILHTAWHPAENI1AIAA 
TWLY 1 FQDKVNSDMH 


6048 


1 


3194 


G1RTPKFCDSPTSPLEMRNGRGRGKP.MRPNSNTPVNETATASDS 
KGTSNS S KTRAGANSKGRRGSQNS5 EHR P PASS TS EDVXAS PSS 
ANKRKNKPUSDMELNSSSEDSKGSKRVRTNSMGSATGPLPGTKV 
EPTVLDRNCPS P VL I DCPH PNCNKK Y KH I NGLXYHQAKAHTDDD 
SKPEADGDSEYGEEPILHADLGSCNG\A£VSQK\GSLSPARSAT 
PKVRLVEPHSPS PS S KFSTKGLCKKKLSGEGDTDLGALSNDGSD 
DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 

SARPI /APIJVI ppqqi ytfqtatft^spgsssgltatvaoamp 

NSPQLKPI QPKPTVMGEPFTVNPALT PAKPKKKKDKKKKESSKE 
IiESPLTPGKVCRAEEGKSPFRESSG^GMKMEGbLNGSSDPHQSR 
1ASI KAEAPKI YS FTDNAPSPS IGGS SRLEMTTPTQPLTPLHW 
TOMGAEASS VKTNS PAYSD1 SDAGErGEGXVDSVKS KDAEQLV X 
EGAXKTLFP PQPCS KDSPYYOGFESY YSPS YA0SS PGALNPSSQ 
AGVESOALKTKRDEEPES1EGKVKNB1CEEKKPELSSSSQOPSV 
I OORPKWYMQSLY YNQYAY VP PYGYSDQS YHTHLLSTNTAYRQQ 
YEEQQKRQS LEQQQRGVDKKAEMGLKEREAALKEEWKOKPS I P P 
TLTKAPSLTDLVKSGPGKAKE PGADPAKS VI I PKLDDSSKLPGQ 
APEGLKVKLSDASHLSKEAS EAKTGAECGRQAEMDP I LWYRQEA 
EPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKDSVPK 








[^DGKESTSSDCKLPTSEESRLGSKEPRPSVHVPVSSPLTQHQSY ~ 
lPYMHGYSYSQSYDPNHPSYRStfPA\70MQNYPGSYLPSSYSFSP 
YGSKVSGGEDAJDKARASPSVTCKSS5ESKALDILQQHASHYKSK 
SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 
SPSQRLMSTHHHHHHLGYSLLPA0YKLPYAAGLSSTAIVAS0OG 
STPSLYPPP2R 


6049 


215 


108? 


AMT3VFDRRVPS1RSGDFCAPF0TSA7\MHHPS0ESPTLPESSAT 
DSDYYSP?GGAPKGYCSPTSASYG\KALNPYQYQYHGVNGSAGS 
yPAKAYADY$YA£SYH0YGGAYNRVF5ATNQPEKEVTEPEVRMV 
NGKPKKVRKPRT1YSSFOLAALQRRFOKT0YLALPERAELAASL 
G LTQTCV K I W FQNKRS K I K K I M JONfGEN! P PEHS PSSS DPMACKS P 
0SPAVWEPOGSSRSLSHHPHAHPPTSNOSPASSYLENSASWYTS 
AASS3NSHLPPPGSLQHPLALASGTLY 


6050 


566 


1718 


KGLHRTCCAMEESDSEKTTEKENLrGPKMDPPLGEPGXGSLGWVlj 
PNTAMKKKVLLMGK5GSGKTSMRS 1 1 1 AM Y 1 ARDTRR LG AT I LD 
RIHSLQlNSSLSTYSbVDSVGNTKTFtVEHSHVRFLGNLVLNLW 
DCGG0DTFMENYFTS0RDNIFRNVEV1J Y VFDVESRELEKDMHY 
YpSCLEAI LQNSPDAKI FCLVH KMDLVQEDQRDLI FKEREEDLR 
KLSRPLECSCFRTS 1 WDETLYKAWS 3 1 VYQL3 PNVQQLEMNLRN 
TAEI IEADEVLLFERATFLV3SHYQCKEQRDAHRFEK3SNI3KQ 
FKLSCSKLAASFQSMEVRNSNFAAFI Dl FTSNTYVMWMSDPS I 
PS AATL1K 3 RNARKKFEKLERVDGPKQC1.-LMR 


6051 


566 


1718 


KGt»ERTCCAMEESDSEKTTEKENIjG?RKDPPLGEPG\GSliGWVL 
PNTAI1KKKVLLMGKSGSGKTSMRSI I FANYIAKDTRRLGAT3LD 
K 1 HSLQ3N SS LSTY SLVDS VGNTKTFDV EHSHVRFLGNLVLNLW 
DCGGQDTFMENY F7SQR DN I FRNVEVI J Y VFDVESRELEKDMHY 
YQSCLEA3 LQNSPDAKI FCLVH KMDLVCEDQRDL3 FKEREEDLR 
RLS R PJLECS CFRTS 1 WDET1.YXAWS S 3 VY 0L3 PNVQQLEMNLRN 
FAEI I EADEVLLFERATFLV3 SHYOCXECRDAHR FEKI SN3 3KQ 
FKLSCSKLAASFQSMEVRNSNFAAFID2 FTSNTYVMWMSDPSI 
PSAATL3NIRNARKHFEXLERVDGPKOCLLMR 


6052 


566 


1718 


KGLERTCCAMEESDSEKTT£KENLGPRMnPPLGEPG\GSLGWVL
PNTAMKX KVLLMG X SGS G K TSMR S 3 1 FANY3 AR DTK RLG AT I LD
MHSLQINSS LSTY SLVDS VGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENY FTSQRDN I r RNVEVL I YVFDVESRELEKDMHY 
YQSCLEAIL0NSPPAK3 FCLVH KMDLVQEDORDLI FKEREEDLR 
RLSRPLECSCFRTSIWDETLYKAWSSIVYQL1 PNVQQLEMNLRN 
FAE H EADFJVLLFERATFLVT S HY QCKEORDAHRFE KX SN 1 1 KQ 
FKLSCSKLAAS FQSMEVRN5NFAAF I D I FTSNTYVMWMSDPS I 
PSAA.TLIN 3 RNAR KHFEKLER VDG PXQCLLMR 


6053 


201 


1704 


KGTEMNKSRWQSRFRHGRRSHQQNPWFRLRDSEDRSDSRAAOPA 
HDSGKGDDESPSTSSGTAGTSSVPELPGFYFDPEKKRVFRLLPG 
HNNCNPLTKESlROKEMESKRLRLiCEEDRRKKlARMGFNASSM 
LRKSQLGFLjNVTN YCHLAKELR lscme r kkvqirs mdps alasd 
RFNL3LADTNSDRLFTVNDVTVGGSKYGI INLOSLKTPTLKVFM 
HBNLYFTNR KV\NSVCWAS LNHLDSH 3 LLCLMGLAETPGCATLL 
PASLFVNSHPAG3 DRPG\MLCS FR I FGANSCAWSLNIQANNCFS 
TGLSRRVLLTNWTGHRQSFGTNSDVLACQFALMAPLLFNGCRS 
G23 FAI DLRCGNOGKGWKATRLFHDSAVTS VRILODEOYLKASD 
tfAGKI WLW DLRTT KCVRQ Y EGHVN EY A Y L PLHVHEE EG I LVAVG 
0DCYTRI WS LHDARLLRTI PS P YPASKAD I PS VAFS SRLGGS RG 
APGLLKAVGODLYCYSYS 


6054


1 


1054 


PPIARl^EFGTSRPJiMAAPSGVHLLVRRGSHRIFSSPLNHIYLK 
KQSSSQQRRNFFFRRQRD3 SHS I VLPAAVSSAHPVPKHI KKPDY 
VTTG I VP DWGDS 3 EVKNEDQ3 OGLHQAC0LARHVLLLAGKSLKV 
DMTTEEI DAXiVHREI3 SHNAYPS PLGYGGFPKSVCTSVNNVLCH 
GI PDSRPLQJDGD3 3 N 1 D VTVYYNG Y HGDTSETFLVGNVDECGKK 
LVEVARRCRDEA3AACRAGAPFSVI GNT3SH1 THQNG FQ VCPHF 
VGHG1GSYFHGHPE3WHHANDSDLPMEEGMAFT3EPI3TEGSPE 
FKVL EDAWTWS LD / TS JCVS AQ FEH TVL 1 TS RGAQ3 LTKLPHEA 


6055 


421 


2364 


ppyfllsflawmlygosdrtetbjsosagpppgtl^csalhhdp 
ccancsrfcrdcsppacqchthvfpgnalngvqppei>srtlal! 
£srbpprkkkksqtetgkerertsfltqggkrfelqhglagicm 
tlbltgds i vs aem^dhvtmanrelafkagdvikvldasnkow 
wwgqiddeegnfpasfvrlwvnhedeveegpsdvonghldpnsd 
clclgr plqnrdqmranv jneimsterhyj khlkd i cegylkqc 
r k r rdmfsdeqlkv i fgn ied1yr fomgfvrdlekoynndd pkl 
s 1 1 gpcflehqdgfw 1 y s ey cnnh l3acmelsklmkdsryqhff 
eacrlloonidiaNidgflltpvokickyplolaellkytaodh 
£dyr yvaaaiavmrnvtqoi nerkrrlekl dki aowqasvldke 
g£dildrssel»iytgem^wi y0p\ygrnqqrvfflfdhomvbck 
kdlirrdilyykgridndxyewd1edgrdddfnvsnknafklh 
nketeeihlffakkleekirwlrakreerkmvqedex1gfeise 
nqkrqaamtvrkvpkokgvnsarsvppsypppodplkhgoylvp 
\dgiaq5qvfeftepkrs0spfw0nfsrltpfkk 


6056 


43 


3358 


SCGRGPVRVRSEQLSPSAEQVSQJSQISLGRRPLSSLPPPPSRA 
LAPTRAPDTALT3MEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YLAY MESKG AHRAGLAKV 1 P PKEWKPRQCYDDI DNLLI PAP 1 00 
MYTG QS GLFTQ YN I QK KAMT VXE F RQLANSGKY CT PR YLD Y KDL 
ETRKYWKNLTFVAP I YGADI NGSI YDEGVDEWNI ARLNTVLDWK 
EECGISIEGVNTPYLYPGMWKTTFAWHTEDMDLYSINYLHFGEP 
KSV1 Y A I ? PEKGKRLER LAQG F FP S S SQGCDAFLRH KMTL3 S PS V 
LKKYGI ?FDK1 TOEAGE FMI TFPYGYHAGFNHGFNCAESTK FAT 
VR W I D YGK VAKXCTCR K DMVK ISMD3 F VR KF0PDR YQLWXOG KD 
I YT3 DHTKPTPASTPEVKAWLQRRRKVRKASRSFOCARSTSKR P 
K/DEEEEVSDEVDGAEVPNPDSVTDDI>KVSEKSEAAVKLRNTEA 
SSEEESSASRMOVE0NLSDH3KLSGNSCLSTSVTEDIKTEDDKA 
YAYRS VPS ISSEADDS I PLSTGYEKPEXSDPS ELS WPKS PESOS 
S VAE SNG VbTEGEESDVES HGNGLE PGEI PAVPSG ERNSFKVPS 
lAr.GENKTSKSWRHPLSRPPARSPMTLVKOOAPSDEELPEVLSI 
EHEVEETESWAXPL1HLW0TKPPNFAAE0EYNATVARMKPHCAI 
CTLLMPYHKPDSSNEENDARWErKLDEWTSEGKTKPljlPEMCF 
I YSEENIEYSPPNAFLEEDGTSLL1 SCAKCCVRVHASCYGI PSH 
E1CDGWLCARCKKNAWTAECCLCNLRGGALKQTKNNKWAH\T^CA 
VAVPBWFTNVPERTQIDVGR3 PL0RLXLKC3 FCRHRVKRVSGA 
C I OCS YGRCPAS FKVTCAKAAG VIA MEPDDWP Y WN I TCFR H KV 
N PNVXS KACEKV ISVGCTV 3 TKHRNTR Y YSCR VMA VTSQTF YEV 
MFDDGSFSRDTFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLYG 
AXYFGSNI AHMYQVEFEDGSQIAMKREDIYTLDEELPKRVKARF 
VSAGRCHLGTCOVNSl>SSPHVS0A0GETYLGFWlNSKXSCCNIF 
LSGTY 


6057


1 


653 


FVAiU>KEQEGEGGLGPRK£KGRARGRERRRKMQUTRCCFVFLVQ 
GS LYLV1 CGQDDGPPGSEDPERDDHEGQPRPRVPRXRGH1 S PXS 
R PfWJSTLLGLLAPPGEAWG 3 LGQPPNRPNHSPPPSAXVXX 3 FG 
WGUFYSNI KTVALNLLVTGX I VDHGNGTFSVHFQHNATGQGN I S 
1 S LVPPSKAVEFHQE00I Fl EAXASX3 FNC\RMEWEKVE\RGRR 
TSLFTHDPAXICSRDHA0SSATWSCSCPFXVVCVY3AFYSTDYR 
LVOKVCPDYNYHSDTPYYPSG 


6058 


1 


966 


HPLPSASU3LPSVSI^VSLCVRSALLEAVV?KI*PKRRRARVGSP 
SGDAASSTPPSTRF1?GVA3YLVEPRMGRSRRAFLTGLAJISKGFR 
VLDACSSEATH WnEETSAEEAVSWjfiJ^KPIAAAPl'Ov- 1 VVALiLtD 
ISWLTESLGAGQPVPVECRHRLEVAGPSXGPLSPAWMPAYACQR 
PTPLTHHWTGLSEALE I IAEAAGFEG SEGRLLTFCRAASVLXAL 
PSPVTTXSQLC^LPHFGEHSSRVVOEbLEHGVCEEVERVRRSE/ 
RLFTQIFGVGVXTADRWYREGLRTLDDLREOP0XLTCXX2XAGEP 
S R E AGPWAS LNCTLDP S AS TP 


6059 


2 


3650 


OODFESLADLTDHRAH 8 C PGDGDDDPQLS WVAS SPSS KDVASPT 
ON J GDGCDLGLGEEEGGTGLPy PCGFCDKSFI RLSYLKRHEQI H 
SDKO.PFKCTYCSRLFKWKRSRDRJ1IKLHTGDKXYHCHECEAAFS 
R S DH LKIHLKTHSSS KP F X CTVCKRG F£ STS S IiQSHMQAHXKN K 
EH LAXS EXEAXXDDFMCDY CEDTFSQTEELEKHVLTRKPQLS E K 








adlqc1hcpe vf vdentllah 1 hqahanqkh xcpmcpe \qfs s v ! 
\egvychldshropdssnhsvspdpvlgsvasmssatpdssasv 
ergstpdstiikplrgokkmnddgog'wtkwyscpycsxrdfnsl 
avleihlkti11adkpooshtc01cldsmptlyklnehvrklhkk \ 
haypvmqfgnisafhcnycpemfadinsloehirvshcgpnanp j 
sdgtn^affcnocskgfltessltehiqXqXahcsvgsaklespv 

VQPTQSFMEVYSCPYCTNSP1 FGS ILKLTKHI KENHKNIPLAKS > 
KKSKAEQSPVSSDVEVSSPKRORLSASANSISWGEYPCNQCDLK 
F SN FESFQTHliXLHLEL>LLR KO AC PQCKED FDSQESLLQHLTVH 
YMTTSTH YVC ES C DKO FS S VDD \ LQKi 1\ LLDMPH PLCCTH CT\ L 
C0EVFDS \KVS I \QVHLAVKH£NE KKMYRCTACNWDFRKEADLC 
VHVKHSHIX5NPAKAHKC I FCGETFSTBVELQCHITTKSKKYNCK 
FCS KAFHA1 ILLEKHLREKHCVFDAATENGTANGVPPMATKKAE 
PADLOGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYTME 
VLLONHRIiRDHNI RPGEDnGSRKKAEFI KGSHKCNVCSRTFFSE 
NGLREHLQTHRG PAXHYMCP1 CGERFPSLLTLTEHKVTHS KSUD 
TGTCR I CKMPLOS EEE F I EHCCMH PDLRNS LTG FRCWCMQTVT 
STLELKIHGTFRMOKLAGS SAASS PNGQGLOKLYKCALCLKEFR 
SKQDLVKJLD VNGLPYGLCAGCMAR S ANGQVGGLAPPEPADR PCA 
GLRCPECSVKFESAEDLESHMOVDHRDLTPETSGPRKGTQTSPV 
PRKKTYOCIKCQMTFENEREIOlHVANlIMIEEGimiECKLCNOM > 
FX)SPAKLLCHL,IEHSFEGMGG'XFKCPVCFTVFVUANK.LQQH1FA » 
VHGOEDKIYDCSQCPCKFFFQTELQNHTMSQHAQ 


6060 


2145 


! 20? 


SYElVGKNKLEVNUSOLKALCKCSLPSRLLPLGENLPLLDRGFR 
KEPRSRGSRERDNMl iHLHHSCLCFRSWLPAMLAVLLSLAPSASS 
JDI SASRPNI LLLMAJDDLG 1 GD 1 GCYGNHTMRTPN I DRLAEDGVK 
LTCK I SAASLCTPS RAAFLTG RY PVRSGMVSS 1 G YR VLQWTGAS 
GGLPTNETTFAKILEEKGYATGLIGKWHLGLNCESASDHCHHPL 
HHGFDHF YGK P FS LMGDCAR W ELSEKRVNLEOKLNFLFOVLA^V 
ALTLVAGKLTHLI PVSWMPVI WSAl^AVLLLASSYFVGALI VHA 

dcflmrnht1teqpmcfqrttplilcevasflkrnkh3ppllfv 
s flhvhi pli tmenflgks lhglygd7^vkemdwmvgr1ldtldv 
eglsnstll y fts dhggslenqlvgntqyggwng i ykggkgmgg w 
egg i rvpgi frwpg vlpagr vi geptslmdvfpt wrlagsevp 
odrv 1 dgqdlijpll,bgtaqhsdheflmhycerfi*haarwhordr 
gtmwkvhfvtpvfopegagacygrkvcpcfgekvvhhdppllfd 
lsrdpsethiltpasepvfyovmerNvooavwehortlspvplo 

LDRlGN I WR PWLQ P CCG P F P bCWCLR EDDPQ 


6061 


no 


I33U 


MNI HN5KR K.TI KK I NTFENRHLMLDGMPAVRVKTELLES EQG S PN 
VHKY PDM EA VPLLLIWVKG E P PEDSLS VDHFQTOTE PVDI>S INK 
ARTSPTAVSSSPVSMTASASS PSSTSTSSSSSSRLASSPTVI TS 
VSSASSSSTVLTPGPLVASASGVGGQQFLHIIHPVPPSSPMNLO 
SNKLSHVHR I PVWOSVPWYTAVRSPGNVNNTl WPLLEPGRG 
HGKAQMDPRGLSPRQS KSDS DDDDLPNVTLDS VNETGSTAJ,S I A 
RAVQFJVH PS PVSRVRGNRMNNOKFPCS I SPFS3 ESTRRQRTVLN 
PPDSKKTAxb ITJLDr JCbdoJ'OKVJlKK lMlbtAr X 

KCTWEGCTWKFARSDELTRHYRKHTGVKPFKCADCDRSFSRSDH 
LALHRRRHMLV 


6062 


71 


107S 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT 
L JVLFWGS KHFWPEVPKKAYDMEHTFYSNGEKKKI YME IDPVTR 
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQI KVI P 
EFSEPEEEIPENEEITTTFFEOSVIWVPAEKPIENRDFLKNSKI 
LElCDNVTMYW\IKPTL\ISGTFAXOLHHNFAFIILVSELQDFE 
EEGEDLHFPANEKKG I EQNEOWWPQVKVEKTRHARQASEEELP 
INDYTENG I EFDPMLDERG YCCI YCRRGNRYCRRVCEPLbGY YP 
YPYCYCGGRVICRVIMPCNWWVARMLGRV 


6063 


71 


1079 


ETMAKNGP ENCEDCH I LNAEAFKS KK ICKSLKI CGIiVFGI LALT 
LI VLFWS KHFWPEVPKKAYDMEHTFYSNGEKKKI YMEIDPVTR 
TE I FR SGNGTDRTLB VHBFKNG YTG I YFVGLQKCFI KTQ I KVI P 
EFSE?EEE1DENEEITTTFFEC?SVIWVPAEKPIENRDFLKNSKI 
LB1 CDNVTW W\ INPTlA 1 SGTFAKQLHHNFAF 1 1 LVSELQDFE 



WO 01/53312 



PCT/US00/34263











EEGBDLHFPANEKKGlEONEOWWPC-VKVEKTRHAROASEEELP 
rNDriENGIEFPFKLDERGYCCIYCRRGNRYCRRVCEPLLGYYP 
yPYCYCGGRVJCRVI MPCTWWVARMLGRV 


6064 


913 


313 


nlpqslpr ptekh f p y sle kmtdlvav wdvalsdgvh k 1 efehg 
ttsgkrwyvdgkeeirkewmfklvgketfyvgaaktkatinid 
aisgfayeytle:ngkslkkymedrskttntwvlhmdgenfriv 
leoamdvwcngkkiietagefvddgtethfs igthnacy i kav\ 

SSG\KKKEGI 1HTLI VDNREIPEIAS 


6065


1153 


641


MSVRVARVAWVRGLGASYRRGASSFPVPPPGAQGVAELLRDATG 
AEEEAFWAATERRKPGCCSVLLFPGQGSQVVGKGRGLLNYPRVR 
EL YAAAR R VLG YDLLELS L) IG PQETLDRT VHCQ PAI FVAS LAAV 
2KLHHLQPSVIENCVAAAGFSVGEFAALVFAGAMEFAEG 


6066 


68 


3470 


VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCGSDGDVR1W 
RDLDDDDPKF1NVGEKAYSCALKSGKLVTAVSNNT30VHTFPEG 
VPDG I LTRFTFNANHVVFNGDGTK I PAGSSD\ KLVK 3 VDVMDS £ 
OOXT?RGKDAPV!>SLSFDPKDI FLASASCDGSVRVWOI SDOTCA 
3 S WPLI >QKCNDV J NA KS I CR1AWQPXSGKLLAI P VEKS VKLYRR 
FS WSH0FDLSDNF3S0TLNI VTKSPCGQYLAAGSINGLI I VWNV 
ETKDCMEKVKHEKGYA3CGLAKHPTCGRISYTDAEGNLGLLENV 
CDPSGXTSSSKVSSRVEKDYNDLFD6DDMSHAGDFLNDHAVEIP 
SFSKGIINDDEDDEDLMMASGRPRORSHILEDDENSVmSMLKT 
GSSlibXEEEEDGCEGSIHNLPliVTSORPFYDGPMPTPROKPFOS 
GSTPLHLTHRFMVWNS 3GI IRCY7TOEQDNA1 DVEFHDTS3 HHAT 
HLSNTLtJYTlADLSHEAILLACESTDELASKLHCLHFSSWDSSK 
EK1 3DL-PQNEDI EA1 CLGOGWAAAATS ALLLRLFTIGGVQKEVF 
SLAGPWSMAGHGE0LFJVYHRGTGFDGDQCIXJVQLLEIX5KKKK 
03 LHGDPLPLTR KS Y liAW I GFSAECTPC YVDSEG I VRMLNR GLG 

MTUTD T r v, MTl3PUr , vr*VCnUVWV7Ur'7 UCMDrV^T.PPT PfKf^P PPP 
Nlnlrl v.W 1 riorH. I\\s Un X n V VKi 1 Mt»Wr^JX»K^_ J. rV, i^vO Kfrr 

TLPRPAVAILSFKLPYC03ATEKG0MEEOFWRSVlF:JNKLDYLA 
KNG YE Y E ESTKNQATKEQQELLMKMIALSCKLER E FR CVEI JVDL 
MTONAVWLAIXYASRSRKLILA0KLSELAVEKAAELTAT0VEEE 
EEEEPFRKKLNAGYSNTATEWSQPRFRNQVEEDAEDSGEADDEE 
KPE 3 H K PGONS FS KS TNS S D VS AX$ G A VT FS S QGR VN P FKVSA S 
SKEPAM.SMMSARSTKILDNMGKSSKKSTALSRTTNNEK5PIIKP 
LI P KP K P KQ AS AA S Y FQKRNS QTNKT EEV KEEN LKNV1 )S ET PA 1 
CPPCNTENQRPKTGFQMWLEENRSN3 LSDNPDFSDEAD1 3 KEGM 
I5FRVLSTEERKVWANKAKGETASEGTEAKKRKRWDESDETEN 
QEEKAK ENLNLS K KQKPLD FSTNQKLS AFAFKQE 


6067 


858 


321 


LPWQRI^VLUSRGKJ^VTGWIjESUITAQKTAIJL^DGRRKVHYLF 
PIX3KEMA£EYDEKTSELLVRKWRVKSALGAMGQWQLEVGDPAPL 
GAGNLGPELIKESNANPIFMRKDTKMSFQWR3RNLPYPKDVYSV 
SVDQKERCI 1 VRTTKKKYYKKFS2 PDLDRHQ LPLDDALLS FA\ T 
PTAP 


6068 


13 


1730 


GSKWADlJ^^EEKPA3APPVFVF0KDKGOKSPAEQKNLSDSGEEP 
RGEAEAPKHGTGK PES AGEHALEPPAPAGASASTP PPPAPEA0L 
PPFPREliAGRSAGGSSPEGGEDSDREEGNYCPPVKRERTSSLTQ 
FP P SQSEE R S SGFR L K P PTLI HGQAPS AG LPSQKP XEQQ R S VLR 
PAVLQAPCPKALSOTVPSSGTTJGVSLPAIXrrGAVPAAS PDTAAW 
RSPSEAAJ)EVCm:EEK£P0KNESSKASEBEACEK2a»PAT00AFV 
FG^NU^RViaiNESVDEADMENAGHPSADTPTATNYFLOYlSS 
SLENSTNSADASSKKFVFGQNMSERVLSPPKLNEVSSDANRENA 
AAESGSES SSQEATPE KESLAESAAA YTKATAR KCLLE K VE VIT 
GEEAES^LOMQCKLFVFDKTSQSWVERGRGLLRLNDMASTDDG 
TLOSRbS DAG PRGSLR \ LI LNTKLWAQMQ 1 DKASEK\S 3 R3 TAM 
DN EDQG V K V FLI $ AS S KDTGQVYAALHHR I LALRS RVEQ EQEAK 
MPAPEPGAAPSNEEDDSDDDDVLAPSGATAAGAGDEGDGOTTGS 
T 


6069 


583 


27 


PTRPGQAGS S SAMAAQR LGKRVLSK1>0S PSRARGPGGS PGGLQK 
RF^^VTV^YDRRELQRRLDVEXWIDGHLEELYRGMEADMPDEIN 
I DELLELESEEERSRK I CGLLKSCGXPVEDFIQELLAICLOGLHR 






C\PG LS^PS PS? \DGQPSAPFC^PGARTASPLTbbAbFPGP PER 
RPALLCVLSC: 


6070 


478 


858 


JRVTVDGEPLHY I K?LQFbDSFEW/RFTETHRGRHF\QV7bTAE 
TDCR YVS WK R XKLYLL FAQHR Y I SRLFS VL3 GSD I ADK L YA LND 
RVY1GKRYKYDI RbPNFYQMSTPElRRSPLTQHFQNSRRYW 


6071 


2 


2654


HEAP/rKGNNJ\lJ^P\VRLFSLVTRLLLAPRRGLTVRSPDEPLPV 
VRlPVALQRQLEORQSRRRNLPRPVbVRPGPLLVSARRPEbNQP 
AR LTLG 3 WE RAP LASQGWKSRRARRDHFS I ERAQOEAP AVR KLS 
SKGSFADLGAWK?RVLHAL0E\AAPEWQ\PTTVQSST1PSLLR 
GR HVV CAAETGS G KTLS YLLPLLQRLLG \I 1 PS LDS LP I PAP RGL 
VLVPSRSbAOOVRAVAOPLGRSLGLLVRDLEGGHGMRRIRLOLS 
RC PS ADVLVAT PG ALWKALKS RL1S LEQLS FLVbDEADTLLDES 
FL EL VDY ILEKSHI AEGPADLEDP FN PKAQLV LVG ATF PEG VGQ 
LLNKVASPDAVTTITSSKLHC1MPHVK0TFLRLKGADKVAELVH 
I LKHR DRAERTGPSGTVLVFCNSSSTVNWLGY I bDDHKI OHbRb 
0GCMPALMRVGI FOSFOKSSRDIbbCTDIASRGbDSTGVEbWN 
YDFPPTL0DY1HRAGRVGRVGSEVPGTVISFVTHPWDVSLVOKI 
ELAARRRRSLPGLASSVKEPbPQAT 


6072 


- 


742 


KMERTEMMPTIMSOLEFKSKPFPLVSSSRWLVKRGELTAYVL'DT 
VLFSRKTS KQQVy F FL PNDVb I I TXX KS EES YNVNDY S LR DOLL 
VESCDNEELNCSPGKNSSTMLYSROSSASHLFTLTVLSNH/vNEK 

vembbgaetqserarw 1 talghssgkppadrtsltqvei vrsft 
akcpdei.sijQvadvvijI \yqrvsdgwyeger\lrdgergwfpme 

CAKE I TCQAT 3 D KNVER MGR bLGLETNV 


6073 


62C 


860 


PCRRG1AR PLS.R R PG/ S IbVHCAVGVSRSATLVbAY bMLYHHbT 
L\TEAI KKVKDHRGJ IPNKGKLRQUALDRRLROGLEA 


6074 


168


1110 


pgarcmatelccpdsmpchnqqvnsastpspeolrpgdlildha 
GGNRAS RAK V I lltgyahsslpaeldsgacggs s lns egnsgsg 

DSSSYDAPAGNS FLEDCELSRQIGAQLKLLPMNDQIREbCTl IR 
DKTASRGDFMFSADRL.IRLWEEGLNQLPYKECMVTTPTGYKYE 
GVKFEKGNCGVS 1 MRSGEAME0GLRDCCRSIRIGKILIOSDEET 
QRAKV Y YAK FPPD I YRR KVLbMY PI LQTG\NTV I EAVKVLI EHG 
V0PSV3 I LLSLFSTPHGAKSI IOEFPEITI bTTE VH P VAPTH FG 
QKYFGTD 


6075 




1091 


P?TCX)P0EVEHH\YGYVP1LGNKTLPSRCMQCV1VSSSSHL.LGT 
KLGPE I ERAECT 3 RMNDAPTTGYSADVGNKTTYRWAHSSV FR V 
LKRP0FFVN RTF ETVF I FWGP PS KMQKPQGS LVRV I QRAGLVFP 
Nl>lEAY AVS PGRKRQFDDLFRGETGKDREKS HSWLS TGWFTMV 1 A 
VELCDHXWYGMVPPNYCSQRPRLORMPYHYYEPKGPDECVTYI 
ON EHS R KGIWIR FI TE KRVFS S WAQLYG I TFSHPSWT 


6076 


1721 


107 


H PS PTEAPR VOHLTMDCTWR I LFLVAAATGTHAQVQLVQSGAE V 
KKPGASVKVSCKVSGYTLTELSMHWVROAPGKGLEWMGAFDPED 
GET I YAQKFQGR VTMTEDTSTDTAYMELSSbRS EDTAVY Y CATD 
HGDYAFD 1 WGCGTM VTVSSAPTKAPDVFP I 1 SGCRHP KDNS PW 
LACbl TGYH ?TS V\ TVTWYMGTOSQA\QRT FPE3QRRDS YYMTS 
S0LSTPLQQWROGEYKCW0HTASKSKKEIFRWPESPKAOASSV 
P7AQPOA.EGSLAKATTAPATTRNTGRGGEEKKKEKEKEEOEERE 
TKTPECPSHTOPiiGVYbLTPAVODbWLRDKATPTCFVVGSDLKD 
AHLTWETVAG KVPTGG VEEGLLERH SNGS Q SQH S RLTL P R S !i WNA 
flTQVTrTbNTJPQT PPOPt V&T RFPAJvnAPVKI. < ?IJJLLAS'?DPPF 
A\AS WL>LCEV SGFSP PNI LLMHLEDHGEVNTSGFAPARPL? KP \ 
PSTTFWA\WSVLRVPAPPSPQPATYTCWSHEDSRTLLNASRSL 
EVSYVTTJHGPMK 


6077 


3687 


1268 


LLPDMNLOPiFWlGLISSVCCVFAQTDENRCLKANAKSCGEClQ 
AGPNCG WCTNST FLOEGMPTS AR CDDLE ALK KKGCPPDD I EN PR 
GS KD I K KNKNVTNRS KGTAEKLK PEDITQ1 Q PQQLVLRLRSGEP 
QTFTbKF KRAED Y P I DLYYLM \ D LS YSM KDDLENVKS LGTDLMN 
EMR R I TSDFR 1 G FGSFVEKTVNP Y I STTPAKLRNPCTS EQNCTS 
PFS YKNVLS LTNKGEVFNELVGKQRI SGNLDSPEGGFDAI MQ VA 
VCGS LI G WRJfVTR LbVFSTDAGFHFAGDGKbGG I VLPNDG0CHL 
WO 01/53312 



PCT/US00/34263 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








ENNM YTMSH Y YDYPS I AHLV0KLSENN1 QTI FAVT E E FOP V Y KE 
LKNLI FKSAVGTLSAMSSNV1QLI IDAYNSLSSEVI LENGKLSE 
GVTI SY0SY\CKMGVHGTGENGRKCSN3 SIGDEVQFE3SITSNK 
CPKXDSDSFKIRP1GFTEEVEVILQY1CECECQSEGI PESPKCH 
EGNGTFECGACRCNEGRVGRHCECSTDEWSEDIGCFTARKENQ 
FQK S.ASNWG R VPSAGQCVCR KR DNTNE I YSGKFCECDNFNCDRS 
NGLICGGNGVCKCRVCECNPNYTGSACDCSLDTSTCEASNGQIC 
NGRG1CECGVCKCTDPKFQGQTCEMCQTCLGVCAEHKECVQCRA 
FNKGEKKDTCTOECSYFMITKVESRDKLP0PV0PDPVSHCKEKD 
VDDCWFYFT Y S VNGNN EVMVHWEN PECPTGPDI I P 1 VAGW AG 
IVLlGLALbLIWKLLMIIHDRREFAXFEKEKMNAKWDTGENPJY 
KSAVTTWNPKYEGK 


6078


1421 


180


ETEDVMELLEEDLTCPICCSLFDDPRVLPCSHKFCKKCLEG1LE 
GS VRNSLWRPVPFKCPTCRKKTFSYWEL 3 PLQVNYSLKGIVEKY 
NK I X I S PKMPVCKGH \ LGQPLNJ F\ ClATDMQLDL/ CG I C\ATR 
GEHTKHVFCSIEDAYAQERDAFESLFQSFETWRRGCALSRLDTL 
ETS KR KSLQLliTKDSDKVKEFFEKLOHTLDOKKNEILSDFETMK 
LAVMOAYDPEINKLNTILOEQRMAFNIAEAFKDVSEPIVFLQQM 
OEFREKI KV 3 KETPLP PSNLP AS P LMKN FDTSQWED I ICLVDVDK 
LSLPODTGTFISKIPWSFYKLFLLILLLGLVIVFGPTMFLEWSL 
FDDLATWKGCLSNFSS YLTKTADFI EOSVFY MEOVTDGFFI FNE 
RFK3MFTLWLNNVAEFVCKYKLL 








ATARDLGCARRIDRVVMESTPSRGLNRVHLOCRNbQEFl>GGl>SP 
G VL DR b YG H PATCLAV FR ELPS LA KN WVM RM L FXiEOPL PQAAVA 
LVf\'KKEFSKAOEESTGLLSGLRI WHTOLLPGGLOGLILNPI FRQ 
NLRIALLGGGKAWSDDTSOLGPDKHARDVPSLDKYAEERWEWL 
HFMVGSPSAAVSODIAOLLSOAGLMKSTEPGEPPCITSAGFQFL 
LLDTPAOLWY FMLOYLOTAQSRGMD LVE I LS FLFQLS FS 7LGKD 
YSVEGMSDSLLNFLOHLREFGLVFORKRKSRRYYPT/RALAINL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTESELQIAL3ALFSE 
MLYPFP\NKW\ARVTR\ESVQOAIASGITAOXJIIHFLRTRAHP 
WLKOTPVLPPTITD01RLWELERDRLRFTEGVJUYWQFLSOVDF 
ELL\LA>IAPK1^VLVFE/NTPAKRLMVVTFAGHSDVKRFWKR0K 
HSS 


6080 


i 


1199 


IET I DH VG EFAMAAQAAG V SRQRAATQG LG S NONALKYLG QDFK 
TLROQCLDSGVLFKDPEFPACPSALG YXDLGPGSPOTQG3 1 WKR 
PTELCPS PQF I VGGATRTDI CQGGLGDCWLLAA1 ASLTLNEELL 
YRWPRE>QDFQEMY AG1 FH FOPLCP P S ? \FW0YGEV?VEW I DDR 
LPTKNG Q LLF LHS EOGNE FWS AL L E KA Y AKLNGC YEALAGGS TV 
EGFEDFTGGI SEF YDLKKPPANLYQI I R KALCAGS LLGCS I DVY 
SAAEAEAITSOKIjVKSHAYSV'IGVEEVNFOGHPEKLIRLRNPWG 

s veksg aw sddap ewhhi dprrkeeldkkvedge fwmslsdfvr 
ofsrleicnlspdslsseevhkwnlvlfnghwtrgstaggcqny 
?gss 


6081 


3 


865 


emlpllbplpllwa/galaqdarfrlempesvtvoegbclfvhc 
svfyleygwkdstpayghwfregvsvdoetpvatknstokvoke 

TQGRFHJL.LGDPSRNNCSLSIRDARRRDNGSYFFWVARGRTKFSY 

kysplsvtvtalthrpdilipeflksghpsnltcsvpwvceqgt 

PPIF5WMSAAPTSLGPRTLHSSVLT3 1 PR PODHGTNIjI CQVTFP 

gagvttertiqlsvsvjksgtveewvlavgwavkilllclcli 
I lsfkkkkavravevsenvyavmg 


6082 


283 


1288 


EARSPX3PT0TRTAPGLAAPGLAQPAAI/RLLLSRPPSAAMDGDGD 
PESVGOPEEASPEEOPEEASAEEERPEDOOEEEAAAAA\y\LDE 
LPEPLIA/LRVXjAALPRHE\LVQACR\LVCLRWKELVDGAPLW1> 

lkcqoeglvpeggveeerdhwqqfyplskrrrnllrkpcgeedl 
egwctvehg^dgkrveelpgdsgvtefthdesvkkyfassfewcr 
kaq vt: DLQ AEG YW E e lldtto paiwkdwysgrs dag clyeltv 
xllsehenvlaefssgovavpodsdgggmmeishtftdygpgvr 

FVRFEHGGQDSVYWKGWFGARVTNSSVWVEP 


6083


186S 


309 


KOHCAERRGU5MSLAD5LLADLEEAAE EEEGGS YGEEEEEPA1 E 
DVQEETQLDLSGDSVKTI AKLWDSKKFAEI KNIK I EE Y I SKQAXA 
S EVMGP VE AAPEY RVIVDAMN LTVE I F.N ELN I1HKFI RDKYS KR 
F PE LES LV PNALDY1 RTVKELGNS LDK C W^ 1 ENLOQ I LTNAT I M 
WSVTASTTOGOOLSEEELERLEEACPKALELNASKHRiyEyVE 
SRKSFI APNLSI 1 IGASTAAKIMGVAGGLTNLSKMPACNIMLLG 
AORKTLSGFSSTSVLPHTGYIYHSDIVOSI.PPIFPPPSVAPXDL 
RR KAAR LVAA KCTLAAR VDS FH ESTEGKVGY E LKDE I E R KFDKW 
QEPPPVKQVKPLPAPbDGQRKKRGGRKYRK14KERU3LTEIU\KO 
7U3RMSFGEIEEDAYOEDLGFSUGHLGKSGSGRVROTOVNEATKA 
RISKTL0RTL0K0SWYGGKST1RDRSSGTASSVAFTPL0GLEI 
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST 


6084 


186S 


309 


KQWCAERRGLGMSLADELtiADLEEAAEEEEGGSYGEEEEEPAIE 
DVQEErOLDLSGDSVKTIAKLWDSKMFAElMMKIEEYISKQAKA 
SEVHGPVEAAPEYRV1VDANKLTVE1 ENELUI IHKFI RDKYSXR 
FPELESLVPNALDY IRTVKELGNSLDKCKNNENLQOI LTNAT2 M 
WSVTASTTQGOOLSEEELERLEEACDMALEl.NASKHRIYEYVE 
SRMSF1 APNLS1 1 1GASTAAK JNGVAGGLTKLSKMPACNI MLLG 
AQRKTLSGFSSTSVLPHTGYIYHSDIVCSLPPIPPPFSVAP\DL 
HRKAARIjVAAKCrLAARVDSFHESTEGKVGYELKDEIERKFDKW 
CEPPPVKQVKPLPAPLIX5QRKHRGGRRYRKMKERLGLTEIR\KQ 
ANRMSFGE1EEDAY0EDLGFSLGHLGKSGSGRVRQTCVNEATKA 
RISKTLQRTLUKOSWYGGXSTIRDRSSGTASSVAFTPLOGLEI 
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKRG^MST 


6085 


2 

- 


14Sfc 


SGPRSFOGNRAVGRISXX3GKRWPEVTLLPGVSSERVRRWRRARV 
GVARVXPGNPWXPSPATQVPR/VPAQVYLPGRGPPLREGEELVM 
DEE AY VL.YKRAOTG A PCLS FD 1 VR DH L.GDN R TEL P LTL Y LCAGT 
OAESAOSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEER 
KPObELAMVPH YG<5 1 NR VR VS WLG E E P V AG VWS E KGQVEV FALR 
RLLQ W2EPQALAAFLRDEQAQM KP I F S FAG HMG EGFALDWS PR 
VTGRLbTGDCQKNI HLWTPTDGGSWHVDQRF FVGKTRSVEDLQW 
SPTENTVFASCSADAS J RI WD I RAAPS XACM L TTATAHDG D VW 
I SWS RRE PFLLSGGDDGALKI WDLRQ F KSGS PVAT r KQHVAPVT 
SVEWHPQDSGVFAASGADHQITQWDLG / 1 VEKDPEAGDVEADPG 
liADLPQQLLFVHOGETELKELHWHPOCPGLLVSTALSGFTIFRT 
ISV 


6086 


2419 


1357 


GAATOHGGAWNLl>PCNPHGNGLLYAGFN0DHGCFACG>JEAGFRV 
YNTD PLK EKEKOEFLEGG VGHVEMLFR CN YI .ALVGGGKK P KY P P 
NKVM1WDDLKKKTVIEIEFSTEVKAVKLRR\DKIWVLDSMIKV 
FTFTKNPVHQLHVFEVTCYNPKGLCVLCPNSNTCSLLAFPGTHTG 
HVQLVDLASTEKPPVDI PAHEGVLSCI ALNLCGTR I ATASEKGT 
LIRIFDTSSGHLIQELRRGSQAAN1YC1NFNODASLICVSSDHG 
TVHIFAAEDPKRNKQSSLASASFLPKYFSSKWSFSKFQVPSGSP 
CICAFGTEPNAVIAICADGSYYKFLFNPKGECIRDVYAQFLEMT 
DDKL 


6087 


476 


1877 


QNSQRTGLPITI FSRSFPLI>TGSDLCtNMPL rOTI WKWWKyiMKr 
t»VA V I Y LVS 1 WAV P LCVWE LQKLE VG 1 HTKAW F I AG I FLLLTT 
P 1 SLWVILGHLVHYTQPELQKP1 1 R1LWMVPI YSLDSWI ALKYP 
G I A 1 Y VDTCRECYBAYVI YNFMGFLTN YLTNRYPNLVL I LEAKD 
QQKHFPPLCCCPPWAMGEVLLFRCKLG VLQYTWR P FTT1 VAL I 
CELLGI YDEGNFSFSNAWTYLV 1 1 NNMSOLFAMYCLLLFYKVLK 
pet coTOPV^KITTrvif I.WFV<5FWOaWl ALLVKVGVISERHTW 
EWgTVEAVATGLODFI I CI EMFLAAI A\ HHYTFS YKPYVQEABE 
GSCFDSFLAMWDVSD 1RDDI SEQVRHVGRTVRGHPRKKLFPEDQ 
DCNEHTSLLSSSS0DAISIASSMPPSPKGHYOGFGKTVTPQTTP 
TTAKI SDEILSDTIGEKKEPSDKS VDS 


6088 


1684 


689 


G ASGbVRLLQQGHR CLLAPV AP KLV PPVRGVKKG FRAAFR PQKE 
LERORLbRCPPPPVRRSEKPNWDYHAEIOAFGHRLOENFSLCLIi 
KTAFVNSCYIKSEEAKROQLGIEK£AVLLNLKSNOELSEC?GTSF 
SOTCbTQFLEDEYPDMPTEGI KKLVDFLTGEEVVCHVARNLAVE 
QLTLSEEFPVPPAVLOQTFFAVIGALLCSSGPERTALFIRDFXI 
TQMTG KEL FEMW K 1 1 N PMGLL VE EL KKRNVS AP ES RLTRQSG \ A 
PTALFLYFVGLYCDKXLI AEG PGETVLVAEE EAARVALR XLYGF 
7ENRRP WNYSKPKETL3AEKS I TAS 


6089 


3 


3054 


TRU5JPG$TXSSRPRLCALAAEGHFLGHSWTGSRAGAHTGAPAW 
PSR RLRDLPAGGMWRLRRAAVACE VCQSLVKHSSGI KGSLPLQK 
LHbVSRSIYHSHHPTIiXLORPOLRTSFOQFSSLTNLPLRKLKFS 
P I K YGYQPRRNFWPARIATRLLKLRYL I LGSAVGGG YTAXKTFD 
QWJOMIPDLSEYKW1VPD1VWE1DEYIDFEXIRKALPSSEDLVK 
LAPDFDKIVESL^LLKDFFTSGSPEETAFRATDRGSESDKHFRK 
VSDKEK3DQLQEELLHT0LKY0R1LERLEKENKELRKLVLOKDD 
KGIPFIESbRKSMDMYSEVLDVliSDYDASYirrODHLPRVWVG 
DQS AGKTS VLEMI AQAR I FPRGSG EMMTR SPVKVTLSEG PHH VA 
LFKDSSREFDLTKEEDLAALRHEIELRMRKNVKEGCTVSPETIS 
LNVXGPGLQRMVXVDLPGVIhrTVTSGMAPDTKETI FS 1 SXAYMQ 
DPNAIILCIODGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVDL 
AEKNVAS PSRIQQI I EGKLFPMKALGYFAWTGKGNSSESIEAI 
REYEEE F F0NSKULKTSMLKA>:0VTTRNLSIiA VS DCFWKMVRES 
VEOOADSFKATRFNLE?EWKNNYPRliRELDRNEL»FEKAKNEILD 
EV I S LSQVTPKHWEEI LQQSLWERVSTHV 1 ENI Y LPAAQTMNSG 
TFN TTVD 1 Xl»XQWTDXQl» PN KAV EV AVIET LQEEF SRFMTE PKG K 
EHDD 1 FDXLKEAVKF.ES I KRHKWNDFAEDSLRVI QHNALEDRS I 
SDKO0WnAAlYFM£EAL0ARLKDTERAIENMVGPD\WKXRWLYW 
K^RTQEOCVHNETKNELEKMLKCNEEHPAYLASDEITTVRKNLE 
SRG V E VDPS I* I K DTWKQ V Y RR H FI i K TA LNHCNLCRRGF Y Y YQRH 
FVDS ELECNDWLFWRI QRMLAI TANTLRQQLTNTEVRRLEKNV 
KEVLEDFABDGBKKIXLLTGKRVQLAEDLXKVREIOEXLDAFIE 
ALHQEX 


6090 




1560 


PVFVPAPGAVLEQAS/ASPPLATOT WPLQKCKI PELPVOAS IL 
KELQLFFCQL3 ALFVHY 3 N I Y KT VWWYPPSHP PSHTSLN FHL I D 
FTJLLMVTTIVLGRRFIGSIVKEASQRGKVSLPRSILLFLTRFTV 
LTATGWSbCRSLIHLFRTYSFLNLL/FPLLSVWDVKSVPAAEuR 
P\RKTSLFNHMASMGPREAVSGLAKSRDYLLTIiR\RRGSSTQDS 
CMARTPCP / PIIACCLS PSLJRS EVE FLXMDFNWRMKS VL VS SML 
SAYYVArVPVWFVKNTHYYDKRWSCELFLLVSJSTSVlliMOHLL 
PASYCDLI>HXAAAJ1J^CW0XVDPALCSNVU)HPWTEECMWP0GV 
LVXKS KWYXAVGHYNVA I PSDVSH FR FH FFFSXPLR ILNI LLL 
LEGAVJVyOLYSLMSSEKWHOTISI>AJUILFSNYYAFFKLLRDRL 
VLGKAYSYSASPORDLDHRFS 


6091 


3279 


412 


SSRTREMEEXElLRRQIRLLOGblDDYXTLKGNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRXKYSLVNRPPG 
PSDP PADHAVRPLHGARGGQP PVPQOHVLER0VOLSQGQNWIK 
VKPPSXSGSASASGAQRGSLEEFEDTPWSDQRPREGEGEPPRGQ 
bQPSRPTRARGTCSVSDPLLVCQKEPGXPRMVXSVGSVGDSPRE 
PRRTVSESVIAVKASFPSSALPPRTGVALGRXJLGSHSVASCAPQ 
LLGDR R VD AGHTDQP VPSGS VGG PARPAS G PRQAREAS LWTCR 
TNKFR XNN YKWVAAS SKS PRVARRAL S PRVAAENVCXASAGMAN 
KVEKPQLI ADPEPKPRKPATS S XPGSAPS XYKWKASS PSASS SS 
SFRWQSEAGSKDHASQLSPVLSRSPSGD\RPAVGHSGLKPLSGE 
T P 1>S A Y XV K SRTX IJRRRGSTSLPGDKXS GTS PAATAKSHLSLR 
RROALRGXSSPVLKXTPNXGl>VOVTTHRLCRLPPSKAHLPTKEA 
SSLHAVRTAPTSKV: XTRYR 1 VXXTPASPLSAPPFPLSLPSWRA 
RR 1>S US R S LiVLNRLR P VASGGG KAQPGS P WWRS KGY RC I GGVL Y 
KVSANKLS KTSGQP SDAGSRPLLRTGRLp PAGSCSRSIASRAVQ 
R SlAI I RQ ARQRRE KR XE YCMY YNRFGR CNRG ERCP Y I HDPEKV 
AVCTRFVRGTCKKTIX5TCPFSHHVSKEKMPVCS YFLKG I CSNSN 
CP YSHVYVS R XAEVCS DFLKGY C P LGAXCXK XHTLLCPD FARRG 
ACPRGAOCQLLHRTOXRHSRRAATSPAPGPSDATARSRVSASHG 
PRXPSASORPTROrPSSAALTAAAVAAPPHCPGGSASPSSSKAS 
SSSSSSSSPPAS1X>HEAPSL0EAALAAACSNRLCXLPSFISL0S 
SPSPGAQPRVRAPRAPLTKDSGXPLHI XPRL 


6092 


143 


3190 


AXAPPTGES SEPEAKVLHTKRL YRAVVEAVHRLDIjI LCNXTAYQ 
EViTCPE^JSJ^RNXIJRELCVKl^FLHPVPYGRKAEEbLWRKVYYE 
V 1 QLI KTN K KH I HSRS TLECA Y R TH LVAG IG F YQHLLL Y I <?S H Y 
OLELQCCIDWTHVTDPLJGCKKPVSASGKEMDWAOMACHRCLVY 
LGDLSRYQNEI^vGVDTELUAERFYYQAIjSVAPQIGNPFNOIGTL 
AGSKYYNVEAKYCYLRCIOSEVSFEGAYGNLKRLYDKAAKKYHQ 
LKKCETRKLSPGKKRCKD3KRLLVNFMYLQSLLQPKSSSVDSEL. 
TSLCQSVLEDFNLCLFYLPSSPNbSLASEDEEEYESGYAFLPDL 
LI FQMVII CLMCVHSLEKAGSKQYSAAI AFTLALFSHLVNHVNI 
RWABhXEGENPVPAFCSDGTDEPESKEPVEKEEEPDPEPVPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDLSEGFESDSSHD 
SARASEGSDSGSDKSLEGGGTAFDAETDSEMNSQESRSDLEDME 
EEEGTRSPTLE PPRGR£ EAPDSLNG PLGPSEAS1 ASNLQAMSTQ 
MFQTKRCFRLAPTFSNLbLQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSIQEKLOVLMAEGLLPAVKVF 
LDWLRTNPDLlIVCAQSSOSLWNRLSVbl^LLPAAGELQESGLA 
LCPEVQDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF 
DTDRPIjLSTLEESWRICCIRSFGHFIARLQGSILQFNPEVGIF 
VSIAQSEQESLLQQA0AOFRMAOEEARRNRLMRDMAO1.RLOLEV 

sqlegslqqpkaosams p ylvpdtoalchhlpvi rqlatsgrfi 
viiprtvidgldllkkkhpgardgiryi.eaefkkgnryircoke 
vgksferhki.krodadawtlykii^dsckqltXlaqgageedpsg 

MVTI I TGLPLDNPSLIjSGPMOAALOAAAHASVDI KNVLDFYKQW 
KEJG 


6093 


76 


1002 


ACGRRAWLALR VAR T/S R WGAL \ RGAVWAPGTR PS KRRACWALL 
P P VPCCLG CLAER WRLR P AALGL.RLPG IGQRNHCSGAGKAAPR \ 
PAAGAGAAAEAPGGOWG PASTPSLYENPWTI PNMLSMTR I GIAP 
VLG YL 1 1 EEDFN I ALG V FALAG LTDLLDG F I ARN MANQR S ALGS 
ALDPLADklLI SI LYVSl/TYADLl PVPLTYM J ISRDVML1AAVF 
YVRYRTLPTPRTLAKYFNPCYATARLKPTFISKVNTAVOLILVA 
AS LAA P VFN Y ADS I YLQ 1 LWCFTAFTTAASAY SY YH YGR KTVQV 
IKJD 


6094


23 


1020 


PFLR ClsXGDQXAXMSER X VLmvyPPDEDPSXl PXLKLP-KDRQY 
WRLWAPFNMRCK.TCGEYIYKGKKFNARKETVQNEVYLGLPIFR 
FY3KCTRCLAEITFKTDPENTDYir»EHGATRNFCAEKLbEEEEK 
RVQKEREDEELNNPMKVLENRTKDSKLEMEVLENLQEbkDLNQR 
OAHVDFEAMLRQHRLSEEER RR0O0EEDEQETAALLEEAR KRRL 
LEDSDSEDEAAPSPL0PALRPNPTA1LDEAPKPKRKVEVWEQSV 
GSLGSRPPLSRLVWKKAKADPDCSNGOPOA/APHPRSPAEQEG 
GQPYTPDAWRVLPEPTGCJ PGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYODLIDRGWRRSGKYVYKPVMNQTCCPQ 
YT3RCRPLQFOPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKLDIOCDLKTLSDDIKESLESEGKNSKKE 
EPQELLOSODEVGEKLGSGEPSHS 



VKVHTVPKPGXGADLSKPPCRKAKEIRKERKRLKLMQONPAGEL 
EGFOAQGHPPSLFPPKAKSNQPXSLEDLIFESLPENASHKLEVR 
WRSSPPSSQFKATLLESY0VYKRYOMVIHKNPPDTPTES0FTR 
FLCSS PLE7*ETPPNGr DCGYGSFHQQYWLDGKI I AVGV ■ DILPN 
CVSSVYIYYDPDYSFLSLGVYSALREIAFTRQLHEKTSCLSYYY 
MGFYIHSCPKMKYKGQYRPSDLLCPETYVWPIEQCLPSLENSK 
YCRFNQDPEAVDEDRSTEPDRLQVFHKRAIMPYGVYKKQOKDPS 
EEAAVLQYASLVGQKCS ERMLLFRN 


6096 


2277 


S7S 


QRVRAJ\LLSSAMEDSEALGFEHMGLDPRLLQAVTDLGWSRPTLI 

qekai plalegkdllarartgsgktaayaipmlolllhrkatgp 
vteoavrglvlvptkelarqaosmiqqlatycardvrvajwsaa 
edsvs qravlmekpdwvgtpsr3 lshlcodslklrds lellw 
ceax>llfsfgfeselksllchlpriyqaflmsatfnedvtfalke 
lilhnpvtlkloesolpgpdoloqfqvvceteedkflllyallk 
lsllrgksm.fvmtlersyrbrlfleqfsiptcvlngelplrsr 
CHI 1 sqpnqgfydcvi atdaevlgapvkgkrrgrgpkgdkasdp 
eagvargidfhkvsavlnfdlpptpeayihragrtarannpgiv 
ltfvlpteqfhlglo e ellsgenrgpi llp yqfrmee 1 egfryr 
crdamrsvtkoairearlkeikfellhseklxtyfednpr\di<? 
llrhdlplhpawkphlghvpdylvppai>rglvr?hkk\grscl 

PLVG R PR EQS PR THCAAS ST KERKS DPQPSP V E WG P L»W S 


6097 


1673 


192 


APGTMSGGKKKS S FQI TS VTTDYEG PGS PGASDPPTPOP PTGPP 
PRLPNGEPS PDPGGKGTPRNGSPPPGAPSSR FR WXJLPHG LGEP 
YRRGRWTCVDVY ERDLEPHSFGGLLEGI RGASGGAGGRSLDSRL 
ELASLGLGAPTPPSGLSOGPTSWLRPPPTSPGPQARSFTGC-LGO 
U WPS KAKAEKP PLSASSPQQRPPEPETGESAGTS RAAT P LPS L 
RVEAEAGGSGARTPPLSRR KAVfcMRLRMELGAPEEMGQV P FLDS 
RPSS PALYFTHDASLVHKS PDPFGAVAA0KF5 LAHSMLA 1 SGHL 
DSDDDSGSGSLVGIDNKJEOAMDLVKSHLMFAVREEVEVLKEQI 
RELAERJN AALEQ ENGLLRALA\ S P EQLG S AGP P RG VP R\ LG P PA 
PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 
VQALSNGPWSPG PLPHLL1 3 PS LDGGGEGFRTGRQQCAP FGE^T 
QPPPSLPGTP00 


6098 


168 


1C74 


KYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKHWGT0TEKEDTSN1NPRQTETSVNASRSPEKCAO0ROK 
RLNSASQRSSSLPPSNRKSSTPTKRE3MLTPVTVAYSPKRSPKE 
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLORTNFRKQL 
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR 
TAWEXNKSVSYEOCX?VSVTPOGNDFBYTAXIRTLAETERFF\D 
ELTKEKD0 1 E AALSR M PS PGGR 1 TLQTRLNQE A FGRS FG KI • 


6099 


166 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGT0TEKEDTSN1NPRQTETSVNASRSPEKCA00RQX 
RLNS AS QRS SSLPPSNR KSSTPTKRE1 MLTPV1 VAYS PXR S PKE 
NLSPGFSHLLSXNESSP1RFDILLDDLDTVPVSTLQRTNPRK0L 
\ QFLPLDDS E EX\ TY S EKAT\ DN I VNHS S CPE PVPNG VK KVS VR 
TAWHKNKSVSYEOCKPVSVTPQGNDFEYTAXIRTLAETERFF\D 
ELTKEXDO 1 EAALSRM PS PGGR I TLQTRLNQEAFGRS FG KD 


6100 


2 


713 


FVE V3G Y RS RADPEPR G R DTMT YA Y LFKY III GDTG VG XS C LLL 
QFTBKRFOPVHDLTIGVEFGARMVN I DGKQI KLQ1 WDTAGOESF 
RSI TRS Y YRG AAGALL VYDI TR R ET FNHLTS WLEDARQHS £ SNW 
VI ML I GNKSDLESRRDVKREEGEAFARE\HGLI FMETSAKTACN 
VEEAFINTAKEI YRK100GLFDVHNEANGIKIGPQQS I STS VGP 
Sf&QIWSRDlGSNSGCC 


6101 


1 


139S 


FRGRAWPLRE VSKWLGCR RVCS WSAS WGRLP ALSARLS PLLAFR 
GXhWFPLSCAVOO^WGKHGSNSEVARLLASSDPLAOIAEDKPY 
AELWMGTHPRGDAKILDNRISOKTLSOWIAENODSLGSKVKBTF 
NGNLPFLFJCVIjSVETPLSIQAHPNKELAEKLHLQAPOHYPDANH 
K P EMA I ALT? FQC LCG FR PVEE I VTFLK KVFE FQFL I GDEAATH 
I.XQTMSHDSOAVASSLOSCFSHLMKSEKKVVVBOLNLLVKRISQ 
QAAAGNNMEDI Ft;ELLLQLHQQYPGDlGCF7U YFLNLLTLKPGR 
AWFLEANVPflAYLKGDCVECMACSDNTVRAGLTPKFIDVPTLCE 
MLSYTPSSSKDRLFLPTRSCEDPYUS1YDPPVPDFTIMKA\EVP 
G\SVTEYKDLALDSASII,L^0GTVIASTPTTQTP1PL0RGGVL 
F I G ANESVS L K LTEPK DLL2 FR ACCLL 


6102 


70 


241b 


QTPQATLAANGAEDSRGGEMLPAG3IGASPAAPCCSESGDERKN 
LEEKSDINVTVLIGSKQVSEGTDNGDLPSYVSAFIEKEVGNDLK 
S LXX LDKL I EQRTVS KMQLEEQVLT2SS E 1 PXRI RSALKNAEES 
XOFLNQFLEOETHI.'FSAINSHLLTAQPKKDDLGTMISOIEEIER 
HLAYLKW1S0T EELSDN1Q0YLMTNNVPEAASTLVSWAELD1 Kb 
QESSCTHLLGFMRATVKFWHKlLKDKLTSDFEEILAObHWPFJA 
PPQSQTVGLSRPASAPEIYSYLE"1LFCQLLXLQTSHELLTEPK\ 
HSQKNTLFLPPLLSS /WP IQVMLTPLQKRFR YHFRGNRQTNVLS 
XPEWYLAQVl.MWlGNniTEFLDEKlQPILDXVGSLVNARLEFSRG 
LMMLVLEKLATDIPCLLYDDNLFCHLVDEVLLFERELHSVHGyP 
GTFAS CMH 2 LS EETCFQR WLTVERKFALOKWDS MLS S E AAWVSO 
yXDITDVDEMKVPDCAETFMTLLLVITDFYKJ^LPTASRKLOFLE 
LOKJDLVDDFR J R LTQVMKEETRASLG FR YCA I LNA VNY I STVhh 
DWADNVFFLObOQAALEVFAENNTLSKXCLGOLASMFSSVFDDM 
I NLLERLKHDMLTRQVDHV FREVKDAAKLYKXERWLSLPSQSEO 
AVNSLSSSACPLLLTLRDHLLOLEOQLCFSLF.XIFWOMLVEKLD 
VY1YQEI I LANHFNECGAAQLQFDMTRNLFPLFSHYCXR PENYF 
KH 1 KEACI VLN LNVGSALTAGXDVLPVQLQCS FPAT 


6103 


207 


2S23 


ES NSTMTT YLE FI QOKEERDGVR FS WNVWPS S RLEATRM WPVA 
ALFTPLXERPDLPP1QYEPVLCSRTTCRAVLNPLCQVDYRAKXW 
ACN FCYQRNQ F P PS YAG I S ELNQ P AELLPQFS S I E Y WLRGPQM 
PL J FLY WDTCMEDEDLQALKESNOMSLSI .LPPTALVGL 2 TPGR 
MVQ \HELGCEG 1 S XS YVFR GTXDLSAKQLOEM LGLS KVP VTQAT 
RGPQVC^PPPSNRFLOPVQXIDMNLTDLLGEI.QRDPWPVPOGKR 
PLRSSGVALS I AVGLLECTFPNTG/vR I MMF1 GG PATCGPGMWG 
DELXT P I RSWH D I DXDNAK YVK XGTXHFEALANRAATTGHV I Dl 
YACALD0TGLLEMKCCPNLTGGYMVMGDSFNTSLFKOTFORVFT 
KI)MHGQ?K>1GFGGTLE3KTPR\E2XISGAIGPCVSLNSKGPCVS 

PKinr'Tr^TPrtLIVTt I'Y'T CDTTTT BTVryXAfMOUNADTT)nr , p\ DC 
bNfclo lt>o H»yWfti LVjLoF 1 1 XLlHX I r C V VIMUnrJMk'l fybVa \K^> 

A\IQFVTOY\OHSSGgRRlRVTTiARN\WADAOTQIQKIAASFD 
OEAAAILMARLAI YRAETEEGPDVLRWLDRCL1 RLCQXFGEYHX 
DDPSSFRFSETFS^YPQFMFHLRRSSFLQVFNNSPDESSYYRHH 
FMRQD LT0S L 1 M 1 0 P 2 LYA YS FSG PPE P V LLD S S S I LADR I LLM 
DTFFQILIYHGETIAQWRKSGYQDKPEYENFRHLLOAPVDDAQE 
I LHSR FPM PRY I DTEHGGS QARFLLS KTVNPSQTHNNMYAWGQES 
GAPI LTDDVSLQVFMDHLKKLAVSSAA 


6104 


124 


732 


KVSEYlIbSXDKILFKALAMLVLWSPWSAARGVLRNYKERLLR 
KLPOSRPGFPS PPWGPALAVQ\AQPCLQS00«X2 P VEVKRI /RSL 
LDS1 FWMAAPKNRRT2EVNRCRRRNPQKLI KVKNNI DVCPECGH 
LKQXKVLCAYCYEKVCXETAEIRRQIGKQEGGPFKAPT1ETWL 
YTGETPSEQDOGKRI IERDRKRPSWFTQN 


6105 


3 


983


PLHGACTS LVLQR FCHRR PRPCAPARP EDMRR PAAVPLLLLLCF 
GSORAKAATACGRPRMLNRMVGGUDTQEGEWPWQVS 1 QRNGSHF 
CGGSL I AEQWVLTAAHCFRNTS ETSLYQVLLGARQLVQPGPHAM 
YAR VRQV2SNPLYQGTASSADVALVELEAPVPFTNY I LPVCLPD 
PSV1FETGMNCWVTGWGSPSEEDLLPEPR1L0KLAVPIIDT\PR 
CNLLYSK2)TEFGY0PKT2XNDMLCAGFEEGKKDACXGI>SAGPLV 
CLVGOSWLQAGVI SWGEGCARQNRPGVYIRVTAHHNW2 HRI 1 PK 
LQVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTSIESRGRPAAS 
AGIjRRDReALRRWPLRRAPliARATRRRACSPRRCAPRPRACPCG 
WSRARHCPGGLCLLLIiLLCOFMEDRSAOAGNCWLROAKNGRCCV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMIFNGGAPNC 
1PCKETCEK\/PCGPGKKCRMNKKNKPRCVCAPDCSN1TWKGPVC 
GLDGKTYRNECALLKARCKEQPELEVQYOGRCKKTCRDVFCPGS 
STCVNVDCTNTJAYCVTCNRICPEPASSEQYLCGNDGVTYSVSAC 
HLRKATCLLGRS1GLAYEGKCIKAKSCED1QCTGGKKCLWDFKV 
GRGRCSLCDEbCPDSKSDEPVCASDNATYASECAMKEAACSSGV 
LLEVKtfSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


6107 




168 


SRCS£PRPEPGRGRGK/],SPSEHRKWVEVFKACDEDHKGYLSRE 
DFKTAWMLFGYKPSKIEVDSVMSSINPNTSGILLEGFLNIVRX 
KKEAQRYRNEVRHIFTAFDTYYRGFLTLEDFKKAFRQVAPKLPE 
RTVLEVFREV\DRDS\DGHVSF 


6108 




1348 


GC-SI,RFSPPRVPS.CSRVFCPVPPGGCGbP«PMSASRPOSPTTPW 
CLPRR Y MKH KR DDG PEKOEDEAVDVTPVMTCVFWMCCSMLVLL 
YYFYDLLVYWIGI FCLAS ATGL Y SCI »APC VR R LP \ £ AS AG E S A | 
LLAPT 1 PKTNSLP YFHKRPQARMLLLALFCVAVS WWGVFRNEDQ 
WAWVLQDAIjG I AFCL.YMLKTIRLPTFKACTLLLLV1,FLYD1FFV 
PI TP FLTKSGSS I MVEVATGPSDSATREKLPMVLKVPRLNSS PL 
ALCDR PFSLLG FGD I LVPGLLVAY CHRFD3 QVQS SR VY FVACTI 
AYGVGLLVTFVALAJjMORGOPALLYLVPCTLVTSCAVALWRREL 
GVFvn'GSGFAKVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 
PATS PWPAEQS PKSRTSEEMGAGAPMREPGS PAES EGRDOAQPS 
PVTQPGASA 


6109 


] 


1381 


crsragaasggailegtklrrorvdtnkpldplvpsalraamly 
ledylem 1 eclp mdl>rdrftemremd lq vqnamdqleqrvs e f f 
mnakkn k pewree0mas1 kkdy ykaledadekvqiiano 1 y dl vd 
rhlrkld0elakfkmeleadnagiteilerrsleldtpsopvnn 
hhahshtpvekrkynptshhtttdhipekkfkseal^tltsda 
skentlgcrnknstassnnaynvnssqplgsyn1gslssgtgag 
gi\'jmaaaoavoataomkegrrtsslkasyeafknndf0lgkef 
smaretvg y s s s s almttltqnasss aads rsgr ks knnnkss s 
0ossssssssslssgsssstwqe1so0ttwpesdsnsovdwt 
ydpnb pryci cn0vs ygemvgcdtqdcpi ewfk ygcvglteapk 
gkwycpqctNaamkrrgsrhk 


6110 


7/ 


2464 


ACPSAATMSDODHSMDEMTAWKIEKGVGGNNGGNGNGGGAFSQ 
ARSSSTGSSSSTGGGGQESQPSPLAJ.LAATG^RIESPNENSNNS 
OGPSQSGGTGELDLTATQLSOGANGWQI I SSSSGATPTSKEQSG 
S S TNGSNGS ES S KNRTVSGGQYWAAA PNL0NQOVLTGLPGVMP 
NI QY0V1 PQFQTVDGQQLQFAATGAQVQODGSGQI01 1 PGANQQ 
1 1 TNRCSGGN1 1 AAMPNLI^AVPLQGLANNVLSGQTQYVTKVP 
VALiNGNI TI.LPVHSVSAATLTPSSQAVTI SSSGSQESGSOPVTS 
GTTIS S AS IjVSSQASSSSFFTNAJ*SYSTTTTTSNMG I MNFTTSG 
SSGTNSQGOTPQRYSGIjOGSDALNIOONOTSGGSLOAGOOKEGE 
0\ NQQTQAA P KSKSR P0LVQGG\QALQ\AFQAAPLSGQTFTT0A 
I SOETLQNLQLQAVPNSGPI I I RTPTVGPNGOVSWQTLQLCNLQ 
VQNPQAOTI TLAPMOGVSLGQTSSSNTTLTPIASAASI PAGTVT 
VNAAOLSSM PGL0TINLSA1X5TSGIOVHP IQGLPLAIANAPGDH 
GAOLGLHGAGGDC-IHDDTAGGEEGENSPDAOPOAGRRTRREACT 
CPYCKDSEGRGSGDPGKKKOHICHICGCGKVYGKTSHLRAHLRW 

htgerpfmctwsycgkrftrsdelorhkrthtgekkfacpecpk 
r fm r sdh ls kk i kthqn kkgg pg vals vgtlpldsg ag s egsgt 
atpsal 1 ttnkva1-5ea i cpeg i arlansgi jjvkeggqfcs pint 

SANGF 


6111 


163V 


797 


R\T)PRVRGAMAPWGKRIAGVRGVLLDISGVLYDSGAGGGTA1AG 
SVEAVARLKRSRLKVRFCTNESOKSRAELVGOLOftLGFDI SEQB 
VTAPAPAACOILKERGLRPYLLIHDGVXASEFDOIDTS/STPNC 
WI ADAGES FSYQNWNNAFQVLMELE KPVL1 SLGKGRYYKBTSG 
LMLDVGP VMKALSYACG I KAEVGG K ? S PE F FKS ALQA 1 G VE AHQ 
AVM3GDDIVGDVGGAQRCGMRAL0VRT3KFRPSDEHHPEVKADG 
YVDN LAEAVDLLLQliADK 


6112 


77 


196 


KSSKKSFKSKKFIiAKKQKPNRPJ LQW J WJ.KTGNK 1 RHNWK 


6113 


1779 


567 


WEGRSWAACGVW^OGAWGEKSGVRASEAESPGKRADVfWWSROL 
ETMVDHLAMTE1NSQRIAAVESCFGASGQFLALPGRVLLGEGVL 
TKECRXKAKPR1 FFLFNDILVYGS1 VLMKRKYRS0HI1PLEEVT 
tELLPETLOAKNRWMIKTAKKSFWSAASATERQEWISHJEECV 
RPOi>RATGRPA\STEHAAPWl PDKATD1 CMRCTOTRFSAJbTRRH 
HCRKCRVWCAECSRORFliLPRLSPKPVRVCSLCYRELAAOORK 
EEAE EQGAGVPRAASHLARP I CGR PVEMTMTPTRTRRAAG7ATG 
PAAWSSTPRGWPGLPSTADPRPAEH1>SPSQLHCPGPQEGSSRSC 
PGLRDPIPWKQVQRWGVALSGLPVPFCWTLCPYGFTAGNAFPFR 
KPONTHRSW 


6114 


818 


246 


PTSRPRPSPGSPAMSWSACVSAAPSCSKPASSSWPCGPRRCCTR 
RR R CS PR CGLAAG SMCSCS PS WRCT PV PACWPS P P P \ PAEQVQC 
G HLP PHADRRAl/R LP VAAP ARG PG PG H PAG PAG PR PARTP PA S P 
HGPGRPTVPAPPCPMAATEPTPSRPHQRWTREDRMLGRGSQVT 
GRPOWFbRGLVLFSIi 


6115


324 


71 


DVCC-RVCAHPHLYTH 1HMH 1 CAHAC \ I HTHAQLC/ I TASHALAH 
5HLY 'I'CM VMLT AS HT P SHTHP HTAVH K EHRADVLRGTLTP LR 


6116 


595 


1430 


TGVK P PGR WHAA/ 1 SSSGPVFEGARA\ LQTVXKEEEDESYTPVQ 
AARPQTLNRPGQBLFaQLFRQLRYWESSGPLETLSRLRELCRWW 
LRPDVLSKAQI I>E1>LVLEQFLS I LPGE LRVWVQLHN PSSGEE\L 
WPCKR$CRGTLMGHPGGTRALP\ EPRCALDGYRS \ LRSAQ1 WS I. 
AS PLR SSS AliGDHLE P PYEI EARDFLAGOSDTPAAQMPALFPRE 
GCPGDQVTPTRSLTAQUQETMTFKDVEVTFSODEWGWLDSAQRN 
LYRDVMLENYRNMASLGK 


6117 


1433 


222 


VGVPSPAPPCSWBVGPGGGWTPGILKEGOGGRRTPLLLLATRTR 
GIjLSLFPP AAMHP AAFPbPVVVAAVLKG AAPTRG L I RATSDHNA 
S MDFADLP ALFGATLS OEGLQG FLVEA>i P DN ACS P 3 AP ? PPAP V 
NGSVF1ALLRRF1)CNFDLK\^NA0KAGYGAAVVHN\'MSNELLNM 
VWNSEEIQQQXWIPSVFIGERSSEYL.RALFVYEKGARVLLVPDN 
TFPLGYYL I PFTGI VGLLVLAMGAVMI ARCIQHRKRLQRNRLTK 
\E0LKQI\ PTHDYQKGDQYDVCAI CLDEYEDGDKLRVLPCAHAY 
HSRCVDPWLTQTRKTCPICKQPVHRGPGDEDQEEETQGQEEGDB 
GEPRDKPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPIi 
SPPSSPVILV 


6118 


1044 


247 


STI SCRACTSGATPGAQSHRSARGHAAGG KETAALGMERGlCVKK 
KEKEKETQXEX I GEKGREEKVKRKEVEQK 3 KQSKQEKQERRKGK 
EKEE KRTKQGKETNK E K EQFKGQEE KGEK KDS TLTR T PLE P LEK 
HKQl I WbGLDGAGKTS VLHSLASNRVOHS V APTOG FHAVC I NTE 
DSQMEFLE3GGSKPFRSYWEMYLSN/ADSL7JISFSVGFKQDS0P 
ITWKAKKYLHQLIAANPVLPLVVFANK0DLEAAYHITD1HEALA 
II 


6119 


1217 


462 


DPRFVTENTTKAPAOBRTTQPRSSREGTLRSTMEYLSALNPSDL 
LRSVSNISSEFGRRVHTSAPPPORPFRVCDHKRTIRKGLTAATR 
0E LLA K ALETLLLNGVLTbVLEEDGTAVDS EDFFQIXEDDTCLM 
VLQSGOS«SP?RSGVLSYGLGRERPKHSKDIARFTFDVYKOHPR 
DLFGS LNVKATFYGLYSMSCDFQGli\GPKKVLRELLRWTS TkLQ 
GLGHHLLG1 SSTLRHAVEGAEQWQOKGRLHSY 


6120 


785 


175


hi RAGGGGLS SRAXVGSGAC LS LVARANG KG LPRGR KE FV KAVR 
VRYVAFR V RYPRAVCLR LWS CRREV 1 WSGRGKQGGXVRAKAKSR 
SSRAGLQFPVGRVRRLLKKGNYAERVGAGAPVYLAAVLEYLTAE 
1 LELAGN AAR DN K KTR 1 1 P R H LOLA I RND EELNKLLG K VT I AQG 
CWLPNIOAVIXPKKT^CKDEGANDP 




1612 


107 


FVRA0ARGSRQPVRRP1.LGAGSRLRCRSCGRMEPLKVEKFATAN 
RGNGLRAVTPLRPGELiFRSDPLAVTVCKGSRGWCDRCLLGKE 
KLMRCSQCRVAKYCSAKCOKKAWPDHKRECXCLKSCKPRYPPDS 
VRLLGRNA'FKI^MDGAPSESEKLYSFYDLESNINKLTEDKKEGLR 
QLVMTFQHFMREE1QDASQLPPAFDLFEAFAKVICT*SFTICNAE 
NQEVGVGLYPSISLLNHSCDPNCSIVFNGPHLLLRAVRDIEVGE 
ELTICYLDMLMTSEERRKOLRDQYCFECDXCFRCQTQDKDADML 
TGDEOVWKEVOESLKKlEELKAHWKWEOVIiAMCQAJJSSNSERL 
PDI N I YQLXV LDCAMDACI NLGLbEEALFYGTRTWEPYR I FFPG 
SHPVT?GVQVMK1/GK1>QLHQGMFPQAMKKLRLAFDJMRVTHGREH 
SL1 EDLI LLLE/AMRRQHQS2LRERSQREJRRV$LLNALLRSHT 
LCFVS CVNLS YWKFCS V FV 


6122 


2 


232« 


R FRXMADGGAASQDESS AAAAAAADS RMNN P S ETS KPSMESGDG 
NTGTOTNGLDFQKQPVFVGGA1STAOA0AFLGHLHOVOLAGTSL 
OAAAOSLNVOSKSNEESGDSOOPSOPSQOPSVOAAlPQTOIiMLA 
GG0ZTGLTLTPA0WLI,LCK?A0AOAOlJUAAAVQQHSASC?OHSAA 
GATISASAATPMTQIPLSQPIQ3AQDUX)b0OLQ0QNLNLQQFV 
hVH P TTN JLQ PA \ 0 F 1 1 SO T PQGQQG LLQA \QNLLTQLP RQ S QAN 
liLQSQPRl \TLTS0PATPTCTIAA7PIQTLPgSQSTFKRI DTPS 
LEEP\SDLEEI J E0FAKTFKORR3KLGFT\0GDAGLAKVKLYGND 
FS PTTI FRFEALNLS FKN MCKLKPLLEKWLNDAENI^S SDS SLSS 
PSALNSPGIEGLSRRRKKRTSIEA\NIRVALEKSFLEN\OKPTS 
EEITMIADQLNMEKGVIRVWFCNRROKEKRINPPSSGG\TSSSP 
1 KA1 FPS PTSLVATTPS LVTSS AATTIjTVSP VLPLTS AAVTNLS 
VTGTS DTTSNNTATVI STAP PASS AVTSPSLS PS PSASAS 7S EA 
SSASETSTTQTTSTPJjSSPLGTSQVMVTASGLQTA/AQLLPFXG 
AACLPANASIAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALS PALMSNSTLATIQALASGGSLP I TSLDATGNLVFAN AGGA 
PNIVTAPLFLNPQNLSLLTSNPVSLVSAAAASAGNSAPVASLHA 
TSTSAES I QNS LFTVAS/iSGAASTTTTASKAQ 


6123


3 


2944 


HLLHRWFGTDKOMINFTTGEFOLTEACPYLGTHSEESRFGJLHL 
H LQPLEM KR VG W FTP ADYG KVTSLT L I RNN LTV I DMI GV EG FG 
ARELI>KVGGR1»PGAGGSLRFKVPESTLMDCRR0LKPSKQILS 1 T 
KM FKVEN I GPL P I TVS SL K I NG YNCQG YGFE VLDCHQFS LD PNT 
SRDlSIVFTPDFTSSWVlRDLSLVTAADLEFRFTLKVTliPHHIJ, 
PLCADWPG PSWEESFWR LT V F FVSLSLLGVI LI AFQQAOY I LW 
EFMKTROR QNAS S S S QQtWG P MDVI S PHS YKSNCKMFLDTYG PS 
DKGRGKNCLPVNTPQSR10NAAKRSPATYGHSOKKHKCSVYYSX 
HXTSTAAASSTSTTTEEKGTS PLGSSLPAAKEDl CTDAMRENV7I 
SLRYASGI NVNLOXNLTLP KN LLNKEENTLKNT I VFSNPSSECS 
MKEGIQTCMFPKETDIKTSENTAEFKERELCPLXTSKKLPENHL 
PRNSPQYHQPDLPEISRK>IJJGNNCX)VPVXKEVDKCENLKKVDTK 
PSSEK K 3 H XTS REDMFSEKOD I PFVEOEDPYRKKXLQEKREGNL 
QNLNMSKSRTCRKNKXRCVAPVSRPPEQSDLKLVCSDFERSELS 
SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEX 
KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 
\T3AQHFLPAGDSVSQNDFPSEAPISIjNLSHNICNPMTGNSLPQY 
AEPSCPSLPAGPTGVEEDKGLYSPGDLWPTPPVCVTSSLNCTLE 

NGVPCVIQESAP vhns fi dksatcegqfssaycplelndynapp 

EENMN YANG FPCPAD VOTDF I DHNS QSTWNTP P\NMPAS \ WGNA 
OFPSSSRPYLKSTPKACLPMSGLFGPI\WAP\OSDVYENCCP3N 
t)TTPMCn /TUMFMf>2l\ W^Jf P VVD/~JT\ MDrDn YMMt v ^7UlT f T , T\ A 
fl 1 JCMoU/ J nritflyft\VVV.Atz Iruf \IMrr K/ilrlWJj-J J. n I 1 1 \M 

NRNAN FPLSR DS S YCGNV 


6124 


1573 


236 


SDEALRLAGERGMGRVCLFEISLSHGRWySPGEPLAGTVRVRl, 
GAPLPFRAIRVTCIGSCGVSNKANDTAIVWEEGYFNSSLSLADK 
GS LPAGEHS FP FQFLLPATA PTS FEG PFGKI VHQ VRAA1 KTPR ? 
SKDHKCSLVFTflliSPLNLNSI PDIEQPNVASATKKFSYKLVKTG 
SWL»TASTDLRGYWGQALOLHADVENQSGKDTSPWASLLQKV 
SYKAKRWIHDVRTJAEVEGAGVKAWRRAQWHEQILVPALPQSAL 
PGCSLJHIDYYLQVSLKAPEATVTLPVFIGNIAV/NPCPSEPPA 
R PGAASWGPTPGG\ PSAPPOEEAEAEAAAGGPHFLDPVFLSTKS 
HSQRQPLLATLSSVPGAPEPCPQDGSPASHPLHPPLCISTGATV 
PYFAEGSGGPVPTTSTLILPPEYSSWGY PYEAPPSYEQSCGGVE 
PSLTPES 


6125


a 




KTCPKiTCAFTVSVPDSCCRVCRGDGELSWEHSDGDJFRQPANR 
EARHS YHRSH YD P PPSRQAGG LS R FPGAR SHRGALMDSQQASGT 
1 VOI VINNKHKHGQVCVSNGKTYSHGESWHPNLRAFGI VECVLC 
TCNVTKQECKKI HCPNRYPCK YPC/KIDGKCCKVCPG/ KKAXEEL 
PGOSFDNKGYFCGEETMPVYESVFMEDGETTRKIALETERPPQV 
E VHVWT1 RKGI LOHFHI EK I S KRMFEELPHFKL.VTRTTI.SOWKI 
FTEGEAQISOMCCSRVCRTELEDLVKVLYLERSEKGHC 


6126


1224 


389 


RLLSEAPCPRSRPRFQMNPEWG0AFVHVAVAGGLCAVAVFTG1F 
DSVSVQVGYEHYAEAPVAGLPAFliAMPFNSLVNMAYTLLGLSKL 
KRGGAMGLG PR YLKDVFAAMA LLYG PVQWLRLWTQWRRAAVLDQ 
WLTLPIFAWPVAWCLYLDRGWRP\WLFLSLECVSLASYGLALLH 
POG FEVALG AHWPAVGOALRT \HRHYG /SATPSATYLALGVLS 
CLGFWLKLCDHQLARWRLFCCLTGHFWSKVCDVbQFHFAFIjFb 
THFNTHPRFHPSGGKTR 


6127 


1335 




VLPRRCLV FWKTMDSSREPTLGRLDAAG FWQWIQR ¥ DADEKGY 
I EEKE LDAF FLHMU^KLGTODTVMKANLH K V KQQFMTTQD AS KD 
GRIRMKELAGMFLSEDENFLLLFRRENPLDSSVEFMOIWRKYDA 
DSSGF I SAAELRNFLRDLFLHHKKAi SEAKLEEYTGTMMKI FDR 
NKDGR LDLNDLAR I LALQEN Fl.bQFKMDACSTEKRKGDFEKI FA 
YYDVS KTGALEGPXEVDGFVKDMMELVQPSl SGVDLDKFRE I hh 
RHCDVNKDGK1 OKSELALCLGLK1 UP 


6128


2511 


843 


TCRN S R RQXiER W VW S S QQ VQAH GRNVRAPRLGK 1 AMGLEMS S XD 
SPGSLDGRAWEDAOKPOSAWCGGRKTRVYATSSRRAPPSEGTRR 
GGAARPEKTAKEGPPAAPGSLRWSGPLGPHACPTALPEPQVTSA 
MSSQWGIEP1>YIKAEPASPDSPKGSSETETEPPVAIAPG\PAP 
TRCLPGHKEEBDGEGAGPGEOGGGKLVLSSLPKRLCLVCGDVAS 
GYKYG VAS CEACKAFFKRTI QGSI EYSCPASNECE I TKRRRKAC 
QACRFTKCLRVGMLKEGVRLDRVRGGROKYKRRPEVDPLPFPGP 
FP AG PLAVAGGPR KTAAP VN ALVSHLLiWEPE KL YAM PDP AG PD 

rUX TJlVWATT mT PDOPTWT T QUHVC TDr2T?QQT j^T.cinnfctcuT.OQ 
vnurnvAi iA-L/IjJr UXEiX VV] J orVAMvo X ruf50lJOi>SWvrw V 

VWME VL VLGVAOR S LTLQDE LA FAE YLVLDE EGAR PAGLG ELG\ 
AALLOLVRRliQALRLEREEY VJ ,LKALALANSDSVHI EDEPR1»WS 
SCEKLLHBALLEYEAGRAGPGGGAERRRAGRLLLTLPLLRQTAG 
KVIAHFYGVKLEGKVPKHKLFLSMLEAMMD 


6129


1764 


771 


ARFARS AHEGKMP K KXTGAR KKAENRREREKQLRAS RST I DLAK 
KPCNASKECDKCQRRQKNRAFCYFCNSV0KLPICAQCX5KTKCMM 
KSSDCV 3 KHAGVY STGLAMVGA1 CDFCEAWVCHGRKCLSTHACA 
CPLTDAE C \ VECERGWiDHGG R I FSCS FCHN FLCEDDQFEHQAS 
C0VLEAE7 FKCVS CNRLGOHS CL.RCXACFCDDHTRSXVFKQEKG 
KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGASGYDA 
Y W KNLSS DX YGDTS YHDEEEDE YEAEDDE EEEDEGR KDSDTES S 
DLFTNLNLGRTYASGYAHYEEOEN 


6130 


3 


577 


GRGGTMRE YKVVVliGSG\GVGKS ALTV\OFVTCTFI EK YDPTI E 



DFYRKE I E V\ DSS PS VAGI S WTQQGTEQF \ ASMRDLY J KKGOGC 
ILVYSLVNOOSPO\DIKFMRDOIIRVKVSEXVPVJ\LVGN\SVD 
LESEREVSSSEGRALAEEWGCPFMETSAKSKTMVDELFAE1VRC 
MNYAAQPDKDDPCCSACN1Q 


6131 


3 


1811


SSPREKTSDSSKRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS 
PR Q f .S AMR T.L.'DI.APdRLPRtt*? PRMLPSCS PAI .L.I .T.VljRPr'T/JVF 
G VAAGTR K PNWLLLTDDQDEVLGGMTPLKKTKAL1 G EMGMTFS 
S AY VPS ALCC PSRAS I LTG KY PHNHHWNNTLEGNCS S K S WQK 1 
0EPNTFPA1LRSMCGY07FF\AGKYLNEYGhPDAGGLEHVPLGW 
S Y W YALEKN S KY YN YTL>S I KGKAR KHGEN Y SVDYLTDVLANVSL 
DFbDYKSNFEPFFMMTATPNAPHSPWTAAPQYOKAFONVFAPRK 
KNFN X HG'i'N KHWLI ROAKTPMTNS S IQFbDN'AFRKRWQTLLSVE 
DLVEKLVXKLEFTGELNNTYI FYTSDNGYHTGQFSLP 3 DXRQLY 
EFDlKVPLLVRGPGlKPNOTSXMbVANIDbGPTILDlAGYDLNK 
TQMDGMbLLPI uKOASNbTWKSDVlJVJiJU^E^RNV^L'^ , 1 e,PSLi> 
PG VSQCFPDCV CEDAYNN TY ACVRTMS ALVJN LQ Y CE FDDOEVF V 
EVYKbTADFDQl TNI AKTI DPELLGKMN YRbMMLQS CSG PTCRT 
PGVFDPGYRFDPRLMFSNRGSVRTRRFSKHLb 


6132 


OS 


I 241 


AAGbbPPGLVPEDPPJ^TRNLbPFGIOGPPFAbSRPbFSCVSSGlv 1 
AWEAMEPEFLYDLLOLPKGVEPPAEEEbSKGGKKKYLPPTSRKD 
PKFEELQKPA\ VLMEW1 NATLLPEHI WRSLEEDMFDGLI LHKL 
FQR LAALK 1 .EAEDI ALTATSQKHKLTWLE AVNRS \ CS WR SGR F 
SGA/WESIFNKDLLSTLHbLVALAKRFOPDLSLPTNVQVEVITI 
ESTKSGbKSEKLVEOLTEySTDKDEPPKDVFDEbFKLAPEKVNA 
VKEAIVNFVNQKLDRbGLSVONLDTOFADGVILLLLIGObEGFF ' 
I^LKEFYLJTNSPAEMLKMVTLALELL/IGKGPAOLPC/LALK/ 
TIVNKDAKETLRVLYGLFCKHTQKAHRDRTDjGAPN 


6133 


2 


4256


FVKGSMADTDbFMECEEEEbEPWQKlSDVI LDSWEDYNSVDKT
TTVSVSQQPVSAPVPIAAHASVAGHLSTSn VSSSGAQNSDSTK
KTLVTL I ANNNAGN PLVQQGGQ P LI LTQN PA PGLG TMVTQ P VLR
PVQVMQNANHVTSSPVASQPIFITTQGFPVRNVRPVQNAMNQVG 
I VLNVQQGQTVR P 1 TLVP A PGTQFVK PTVGV PQVFSQMTP V R PG 
STMPVRPTTNTF1-JV1PATLTIRSTVPQS0S00TKSTPSTSTTP 
TATCPTSI^OLAVOSPGOSNOTTNPKLAPSFPSPPAVSIASFVT 
VKRPGVTGENSNEVAKLVNTLNTI PSLGQSPGP VWSNNS SAH\ 
GSORTSGPESSMKVTSSI PVFDLQDGGRK3 CPRCNAQFRVTEAL 
RGHMCYCCPEKVEYQKXGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKblMbVDDFYY 
GRDGGKVAObTK FPXVATS FRCPHCTKRLKNNI RFMNHMKKHVE ! 
LD00NGEVDGHT1 CQHCY RQFSTPFQLQCHLENVHS PYESTTXC j 
KI CEWAFESEPbFLOHMKDTHKPGEMPYVCOVCQYRSSLYEEVD 5 
VHFRMlHEDTRHLLC?YCLKVFKNGNAFQOHYMRHOKR\NVYH\ i 
CMKCRVQFLFAKDKIEHKLQHHKTFRKPKObEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALOEAAPbTSSMDPLPVFLYPPVORS 
I OKRAVRKMS VtfGRQTCLECS FE I PDFPNHFPTYVHCSLCR YST 
CCSRAYANHMINNWPRKSPiabALFKNSVSGIKIACTSCTFVT \ 
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR « 
WVKNMYP P P S FPTN KAATV KS AG ATPAE PE E LbTP LA PALPSPA ' 
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTOEPE 
LASGGGGSGGVGKKBQLSVKKLRVVLFALCCOTEOAAEHFRNPQ 
RR1RRWLRRF0ASCGENLEGKYLSFEAEEKLAEWVLTQREOOLP 
VNEETLFQKATK1GRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFI DFVQRQI HNQDLPLSMI VAI DE I S LFL 
DTEVLSSDDRKENAL0TVGTGBPVICDWLA1 LADGTVLPTLVFY 
RGQMDOPANMPDS 1 LbEAKESGYSDDE I WELWSTRWQKHTACQ 
RSKGMLV^DCHRTHLSEEVLAMLSASSTLPAVVPAGCSSKlOPb 



DVC I K KT V JCN FLH K K W K EQAK EWADTACDSDVLLQLVLVW LGE V 
LGVIGDCPELVQRSFLVASVLPGPDGNIKSPTRNADMOEELIAS 
LEEQLKLSGEHSESSTPRFRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLME] 


6134 


2 


4256


FVUGSMADTDLrMECEEEcLEPWOKlSDVIEDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVSSSGAQNSDSTK 
KTLVTb J ANNNAGN F LVQQGGQ PLI LTQNPAPGLGTMVTQPVLR 
PVQVMOKANHVTSS P VASOPI F I TT0GFPVRNVRPVOKAMN0VG 
IVLNVOOGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STWPVR PTTNTFTTV3 PATLT2 R S T V PQSQSQQTK STPSTS TTP 
TATO PTS LGQLAVQS PGQSNOT TNPKLAPS FPS PPAVS I AS FVT 
VKRPGVTGENSNEVAKLVWTLNTIPSU50SFGPVWSNNSSAH\ 
GSQRTSGPESSMKVTSSIPVFDLODGGRKICPRCNAQFRVTEAL 
RGHMCYCCPEMVEY0KKGK5LDSEPSVPSAAKPPSPEKTAPVAS 
/ TK P S ST P 1 PALS P P Y / T KV PE PNEN VGDAVQT KL I MLVDD FY Y 
GRDGGKVAOLTNFP KVATS FRC PHCTKRLKNNI R FMNHMKiJ ilVE 
LDOONGLVDGHTICOHCyRQFSTPFOLQCHLENVHSPYESTTKC 
KICEWAFFSEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRM I H EDTRHLLCF Y CLJTv/FKNGNAFOQHYMRHOKR \ N VYH \ 
CNKC R V 0 FLF A KDK 3 EH XLQHHK TFR KP KQLEG LK PG TKV T I RA 
SRGQPRTVrVSSNDTPPSALOEAAPbTSSMDPLPVFbYPPVQRS 
IQKRAVR KMSVMGRQTCLECSFEIPDFPNHFPTYV>1CSLCRYST 
CCSRAYANNKINNHVPRXS PKYLALFKNSVSG1 KLACTSCTFVT 
S VGOAMA KKLVFNP SHR S S S I LPRGLTWI AHSRHGQTRDR VHDR 

MVKWMYT- PPQ RPTNVCZXaTVK^AQftTPAFPFPT.T.TPI^PZkT.P^PA 
n V IMM l it x i Off J rv/vrvrt J v JxOrJOfA i r McrCiDUu .1 r u**r- f-\y> r o f a\ 

STATPPFTPTHPOAOALPPLATEGAECLhAnDDODEGSPVTCEPE 
1ASGGGGSGG VGKKEQLSVK KLRW1>FAI*CCNTEQ AAEHFRN PQ 
RRIRRWLRRFOASOGENLEGKYLSFEAEEKLAEWVLTQREOOLP 
VNEETLFOKATK IGRSLEGGFK I S YE WAVR FML R HHLT PKAR RA 
V AHT1>P K D V AEN AGL»F 1 D FVQRQ 1 HNQDLP LSM I V A I D E I S L»F L 
DTEVLS S DDRKENAbOTVGTGE P WCDWLAI LADGTVLPTLVF Y 
RGQMDQ P AKM PDS I bLEAK ESG Y SDDEI MELWSTR VWQKHTACQ 
R S KGML VKDCH RT H L S EE VLAMLS AS S TLP AW P AG CS S K 3 Q P 1* 
DVCI KRTVXNFLHX KWKEOAREKADTACDSDVLLCLVLVWLGEV 
LGVIGDCPELVORSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLWE3 


6135 


2 


4256 


FVHGSMADTDbFMECEEEELEPWOKISDVlEDSWEDYNSVDKT 
TTVSVSWPVSAPVPIAAHASVAGHLSTSTTVSSSGAONST3STK 
KTLVTL I ANNNAGNPLVQOGGO PL I LTQNPAPGLGTMVTQPVLR 
PVQVMQN ANH VTSS P VASQP I FI TTQGFPVRKVRPVQNAKNOVG 
IVLNVQOGQTVRPITLVPAPGTQFVKPTVGVPQVFSOMTPVRPG 
S TM P VR P " TN TFTT V I PATLT I R S TVPQSQSQQTKSTPSTS TTP 
TATQPTS LGQLAVQS PGQSNQTTN PKLAPS FPS PPAVS I AS FVT 
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVWSNNSSAH\ 
GSQRTSGPESSMKVTSS I PVFDLQDGGRJC1CPRCNAOFRVTEAL 
RGHMCYCCPEMVEY0KKGKSLD5EPSVPSAAKPPSPEKTAPVAS 
/THPSSTP I PALSPPY/TKVPEPNENVGDAVQTKblMLVDDFYY 
G RDGGKVAQLTNFPKVATS FRCPH CTKRLKNNIRFf5NHMXHHVE 
LDQQNGEVDGHTICOHCYROFSTPFQLQCHLEKVHSPYESTTKC 
XICEWAFESEPLFLOHMKDTKKPGEMPYVCOVCQYRSSLYSEVD 
VHFRM I HEDTRHLLCPYCLKVF KNGNAFCX)HYMRHQKR\ NVYH\ 
CNKCRVOFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVS SNDTPPS ALCEAAPLTSSMDPbPVFLY PPVORS 
1 QKRAVKK>jSVMGROTCLECSFE I PDFPKHFPrYVHCSLCRYST 
CCSRJ\YANHMINN}IVPRXSPKYLALFKNSVSGI KLACTSCTFVT 



S VG DAMA K.H L V FN P £ H RS S S I L PRGLT W 1 AHS RHGQTR D R V H DR 
NVKNKYPFPSFFTNKAATVKSAGATPAEPEELLTPJiAPAL^SPA 
STATFPPTPTHPOAuALPPLATEGAECLNVDDQnEGSPVTOEPE 
LASGGGG5GGVGKKE0LSVKKLRWLFALCCNTEQAAEHFKNPQ 
R R I WLF k FQASOG F.KILEG KY LS FEAEEK LAEWV LTQR ECQLP 
VN E ZThiX KATK 1 G R S LEGG FKIS YEWAVR FMLRH i t LTPHARRA 
VAHTLPKLA'AENAGLFI DFVORQI HNQDLPLSMI VA3 DEI <- LFL 
DTEVLS S DDR KEKALOT VGTGE PWCDWLAI LADGTV LPT LVFY 
RGQKDOPA>n4?DSlLLEAKESGYSBDEIMKLMSTRW;OKHTAC0 
R S KGMLVKDCHRTKLS EEVLAMLS ASSTLPAWPAGC S SX 3 QPL 
DV C 3 KR7 V KW FLHK K W XEQARE M ADTA CDS DVL.1 /CIA* LVVJ LGBV 
LG V 3 GDC P T.h VQRS FLVAS VLPC PDGN 3 KS PTRN ADMQEE LIAS 
LEE0LKLSGEH$ESSTPRPRSSPEET1EPESLHQLFEGESE7ES 
FYGFEEADLDLME3 


6136


1704 




FGVRMAXEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRLGCVAS 
ShFRGEHKi'f(GGTGRLASLFSShZPQIQPVYVPVPK\ESAl J ASA 
DLE EE I HC KQGQKR KNS OPG V KVADR KI LDD7EDTWS OR X K 1 0 
lNQEEERLKNERTVFVGNLPVTCNKKKLKSFFJCEYGClES\ r RFR 
SLI PAEGT);SKKLAAJ KRKIHPDQKNINAYWFKEESAATCALK 
RNGAQIADGFRIRVDLASETSSRDKRSVFVGNLPYKVEESAjEK 
HFLDCGSIKAVRIVRDKMTGIGKGFGYVLFBNTDSVHLAIiKLNK 

selmgrklkvt^jrsv^kekfkqonsnprlkkvskpkoglnftskt 
aeghpkslfigekavllktkkkgokksgrpkxorxqk 




141 




ralrkrrccpgrrgalgsgpgporrpgrvpeerpapfkerkhpg 
mwnwli va* cla\llglpgkaqeloghvs\ i 3 lageoi gdlakk 

YLWOGVLFCLYLPEAGRGHSFSFHGAAiTAPKOGQELXAK/.LES 
LS C F' Kl>MA P S H CAE W K DQ FL»QI»S Q Y RQLKTAED Y CALN KD J EAQ 
LCUAGLREAGG I FY FS VP P FA YED 1 ARN J NSSCR PGFGAWLR W 
LEKPFGHDHFSAOQLATELGTFFQEEEMYRVDHYLGKQAVAQIL 
PFRDONRKAI.DGLWNRHifVERVEI 3 MKETVOAEGRTSFYEEVGV 
IRDVTjQ^nlTEVLTLVAMELPHNVSSAEAVLRHKLOVFOAl.RGL 
0RGSAWG0YQSYSEQVRRELQKPDSFHSLTFTFAGVLVH3DNL 
RWEGVPF I I,y SGKALDER VGYAR I LFKHQACCVQS E KHKAAAQS 
QCLFRQLVPH 3 GHGDLGS PAVLV5R NLFRPSLPS SW KEMEG P PG 
LRLFGSPL? f.YYAYSPVR ERDAHSVLLSHI FHGRKNFFITTENL 
LASWN FWT P L-LE SI AHKAPRLY PGGAENGR LLD FEFS f>GRLF FS 
000PEQLVPGPGPGP«P£DFQVLRAKYRESSLVSA«SEEL1£K^ 
AND! EA7AVRAVRR FGQFJ ILALSGGS S PVALFQQ1ATAHYGF PW 
AHTHLWLVDLRCVPLSDP ESNFOGIjOAHLLOHVRI PY YNIKN Ai4 
PVHLO0RLC AEEDQGAHI Y ARE I SALGANS S FDLVLLGMGAJX5H 
TA^LFPOSPTCLDGEOLV^TTSPSOPHRRHSLSLPLINRAKKV 
AVLVWGRMKRE1TTLVSRVGHEPKKMPISGVLPHSG0LVWYMDY 
DAFIjC- 


6138 


4507 


934 


EFSKiTDRWCNAVOGVRyRKGDVDGLVROWnDFTTSVENLFRFL 
TTJTSHLLSAVKG<?ERF£LYOTRSLIHELKNKE2HPORRRTTCAL 
TLEAC'EKJLLLTTDLKTXESVGRRI SQLQVS WKDMEPCJLAEMJ KQ 
FQSTVETWDQCEKKIKELKSRliQVliKAOSEDPLPELHEDLHNEK 
ELI KELEOS: J^WTCWLKELQT>1KADLTRiJVbVEDVKVLKECI E 
HLHROWEDLCLRVA1RK0EIEDRLNTWWFNEXNKELCAWLVOM 
ENKVLCTAJD2 SIEEMIEKLQKDCMEElNLFSENKLOLKQMGDQL 
I KASNKSRAAEI DDKLNK INDRWQHLFDVI GS RVKKLK ETFA FI 
OQLDKNMSWLRTWLAR I ESELSKP WYDVCDDQEIQKRLAEOOO 
LQRDI EQHSAGVESVFNrCDVLLHDSDACANETECDS 3 QQTTRS 
U)RRWRNICAMSMERRMK1EETV|RLNQKFLDDYSRFEDWLKSAB 
RTAACPNSSE'/LYTSAKEELkRFEAFQRQIHERLTOLELINKQY 
RRlJyiENRTDTASRLKOMVHEGNORWDNLQRRVTAVLRRLRKFT 



NOREEFBGTRES I bVWLTEMDLQUTNVEHFSES DADDKMRQLNG 
FO0EITLNTNKIDObIVFGEOLI0KSEP\LDAVLIEDELEELHR 
YCOEVFGRVSRFHRRLTSCTPGLEDEKEASENETDMEDPREIQT 
DSWRKRGESEEPSSP0SLCHLVAPGKERSGCETPVSVDS\3PLE 
WDKTGRRGGPSSSH\EEDEEAOYy\SALSGKS3SDGHSir7HVPDS 
PSCPEHHYKOMEGDRNVPPVPPASSTPiKPpyGKLLliPPGTDGG 
K EG PRVLNGNPOQEDGGbAGl TEQQSGAFDR WEM 3 QAQElAHNK 
bKIKQUbOOLNSDI SAITTWLKKTEAELEMLKMAKPPSDIOEI E 
LRVKRLOEILKAFDTYKALWSVNVSSKEFLCTESPESTELOSR 
LR0LSLLWEAAQGAVDSWRGGLRQSLMQCODFHOLSQNLLLWLA 
S AKNR R0 KAHVTD P KADPRALLECR R ELMC/LE KE LVERQPQVDM 
I,OHISNSLLIKGHGEDCIEAEEKVHVl\EKKLKCLREQVSQDLM 
ALOGTCNPASPLPSFDEVDSGDQPPATSVPAPRAKQFRAVRTTE 
GEEETESRVPGSTRPORSFLSRWRAALPLOLLLLLLLLLACLb 
P S S E E DY S CTQ ANN F \ ARS F Y PMLR YTNGP P PT 


6139 




1131 


LGDWVW5RTCGVLETPTSVLRRARARGPCPTDSKWALPRLREGE 
TERRPWEASSVIKTL/LAGW1GGAA55VIVGHPLDTVKTRLQAGVG 
YGNTLSC J RWYRRESMFGFFKGMSFPLASTAVYNSVVFGVFSN 
TQRFbSQHRCGEPEASPPRTLSDLLLASMVAGWSVGl.GGPVDL. 
IKIRL0MQ7PPVSGROPRFEV0GSCSCC\EPAYOGPVKCITT2V 
RNEGLAGLYRGASAMLLRDVPGYCLYFIPYVF^SEWJTPEACTG 
P£ P CAW? L»AGGMAG A 1 S WGTAT PMDW3CSR L0 ADG VY LH KY KGV 
LDCISQSYOKEGLKVFFRGITVNAVRGFPMSAAMFLGYELSLQA 
IRGDHAVTSF 


6140 


694 


136 


RPEL/ELWRLRSRSWRPLGVPRRCHRRNWKEPVRAOPLSVTVWAP 
RCQRP/QPFAPEPSSPNAAVPEA3PTPRAAASAAJ,ELPLGPAPV 
SVAPQAEAEARSTPGPAGSRXGPETFRQRFRQFRYQDAAGPREA 
FR0LREL/SPR0WLRPDI\RTKE0\IVEMLVC>EQLLAILPEAAR 
ARRIRRRTDVRITG 


6141 


2 


984 


AOVGPRSRPCKMPLKLRGKKKAKSXETAGLVEGEPTGAGGGSLS 
ASRAPAKRLVFHAQLAHGSATGRVEGFSS1QELYAQIAGAFEIS 
PS EI LYCTliNTPKIDMERLLGGQLGLEDFIF/vHVKGIEKEVNVY 
KSEDSIjGLTITDNGVGYAFIKRIKDGGVIDSVKTICVGDHT ESI 
NGEN3VGWRKYDVAKKLKELKKEELFTMKL1EPKKAFEJELRSK 
AGKESGEK3GCGRATLRLRSKGPATVEEMPSETKAK\AJEKIDD 
vlelymgi rd: DIATTMFEAGKDKVNPDEFAVALDETLGDFAFP 
DEFVFDVWGV3GDAKRRGL 


6142 


11C 


602 


EAEGEOVCGAKCCGDAPHVENREEETARIGPGVMESKEERALNN 
L I VEN VJi OENDEKDEKEQVANKGEPLALPLNV S E Y CVPRGNRRR 
FRVHOPIL0YRWDIMHRLGEP0ARMREENMER3GEEVROLMEKL 
REKOLSHSLRAVSTDPPHHDHHDEFC\LMP 


6143 


280y 


2?0 


FRMR I FLHCPWNOQMWKI WNLLETSLESCKAHLS 3 QKLLKER\Q 
\OLPVFKHRDSIVErLKRHRWWAGET\GSGKSTQVPHFLLED 
LLLNEWEASKCNI VCTQPRRI SAVSLANRVCDELGCENGPGGRW 
SbCGY01RMESRACESTRLLYCriGVLLRKLOEDGLbSNVS/HM 
FIVDEV\HER\SV0SDFLLIILKEJLQKRSDLKL1LMSATVDSE 
KFST Y FTH CP I LR I SGRSY P V E V FHLED 1 1 E ETG F VLE KDS E Y C 
QK FLE E E EEVT3 NVTS KAGG I KK YOE YI PVQTGAKADLN PFYQR 
YSSRTQHAILYKNPHKlNLDLILEI,LAYIvDKSFOFRNIEGAVLI 
FLPGIAH 1 0QLYDL1>SNPRRFYSERYKVIALHS 3 LSTQDCAAAF 
TLP P PGVR KI VliATNI AETG I TI PDWFVI DTGRTKEN KYHES S 
0MSSLVETFVSKASAUJROGRAGRVRDGFCFRMYTRERFEGFMD 
YSVPE I LRVPLEELCLHIMKCNLGSPEDFLSKALDPPQLQVISN 
AMNLLR X 1 G AC E 1.N E P KL.T PLGQH LAA1»P VNV K I G KML I PGA I F 
GCLDP VATLAAVMTEKSPFTTP I GRKDEADI^KSAIAMADSDHL 
TI YNAYLGWKKARQEGGYRSEITYCRRNFLNR TSLLTLEDVKQE 



h I KLVKAAG FS S S TTSTS W EGNR ASCTLS FQEI ALL KA VLVAGL 
YDNVGK J I V TK£ VDVTSKLAC I V_-?AQG KAQVH PS S VNRDLQTH 
GWLLYQEKIRYA*WLR£TTLJTPFPVLLFGGDIEVQHRERLLS 
IDGWiyFQAPVKlAVIFKQLRVLIDSVLRKKLENPKMSLENDKl 
LQ 1 2 TEL I JCTENN 


6144 


1289 
a 


568 


sgpgsmsgqrvdvkwmlgkeyvgktslveryvhdrflvgpyqn 
vsasggarhggrg sggpv1 ctygfclfpl va \ ti g a afvakvms 
vgdrtvtlgiwdtagseryeamsr:yyrgakaaivcydltdsss 
rerakpwvkelrsleegcqi ylcgtksdlleedrrrrrvdfhdv 
odyadnikaqlfetssktgcsvdelfqkvaedyvsvaafqvmte 

DKGVDLGQKPNPYFYSCCHH 


6145


1109 


1S6 


GGHDLSELERDKTGRCRLSSPVPAVCRKEPCVLGVDEAGRGPVL 
G PMV YA 1 CY CP LPRLADLEALKV ADS KTLLE S ER ER L FAKMEDT 
DFA/GWALDVLSPNLISTSMLC-RVK'YNIjNSLSHDTATGLIQYALD 
OG VNVT QVFVPT VGMPETYQAR LQQS F PG I E VT VXAKADALYPV 
VVSAAS 1 CAKVARDQAVKKWOFVE>XQDU)TDYG\SGYPNPPQD 
/7KAV71>KEHVEPVF\GFP\OFVRF\SWRTAQTI\LEKEAEDVIR 
EDSASENQEGLRKITSYFLNEGSO/vRPRSSHRYFLERGLESTTS 
L 


6146 


42f 


761 


LKKKGKEKAEAOQVKALPGPSLDv^'HRSAGEEEDGPVLTDEOKS 
R/YPGHEAHDOGG\WDAROSIIRKVVDPETGRTRL.l KGDGE-VLE 
EJVTKEKHREINKQATRGDCLAFCMkAGLLP 


6147 




2304 


GTRQLPPPSPGSGPGDSPKGPEGF.APERRRKAKGMLKLYYGLSE 
GEAAGRPAGPDPl.DPTDLNGAHFDPEVYLDKLRRECPLAQLMDS 
ETDMVRQ1RALDSDMQTLVYENYNK F2 SATDT2 RIKMKNDFRKME 
DEMDRLATNMAVI TDFSAR ISATLQDRHER1TKIAGVHALLRKL 
OFLFELPSRIiTKCVELGAYGOAVRYOGRAOAVLOOYCHLPSFRA 
1 QDDCQ V I TAR LAQQLRQR FR EGG S G A P EGA E C V E L LLA 1 jG E P A 
EELCE2FLAHARGRLEKELRJJLEAELGPSPPAPDVLEFTDHG\S 
SG F VGG LCOVAAA YQELFAAOGPAG AE KLAAFAROLGSR Y FALV 
ERR1^0E(X^GDNSIXVRALDRF>1RRLRAJPGALLAAAGLADAAT 
E3VERVARERLGHHLOGLRAAFLGCLTDVR0ALAAPRVAGKEGP 
G1AELLANVASS2 LSH1KASLAAVKLFTAKEVSFSNKPYFRGEF 
CSQC-VREGLIVGFVHSMCQTAQSFCDSPGEKGGATPPALLLLLS 
RLCLDYETATISYILTLTDEOFLVC'DQFPVTPVSTLCAEARETA 
RRLLTHYVKVQGLVISOMLRKSVETRDWLSTLEPRhTVRAVMXKV 
VEDTTA I DVOVLPRLAGVALTQAGGTVPSRGAG AAEDHWQSLPG 
GGDMCIWASHGAS.SVARASVREPOGNKSPRMNTKRAGECLCPRS 
CSFSAQDYDI FA PI LPVEKQRLRVI'QEVRAGLVLVLKIRPQTNS 
CILPLPHSTGSINSDHVPTK 


6148


305£ 


3S3 


VPA VGGTFADGAKG EAEKFH Y IYSCDLDI NVQLX J GS LEG KR EQ 
KSY KAVLEDPMLK FSGLYQETCSDL YVTCQVFAEG KPLALP VRT 
S YKAFS TRWNWN E W LKLPVK Y PDLPRNAQ VALT 1 WDVYGPGXAV 
PVGGTTVSLFGKYGMFRQGMHDLKVWPNCRSOMDOKPTKTPGRT 
S STLSEDOMSPLA X LTKAHROGHKVK\TDWLDRLTFR El EM I NES 
V KRS S NFM YLMGG FRCVKCDDKE YG I VY YEKDGDES S P I LTS FE 
LVKVPDPOMSLENLVESKHHNLPR<?LRSGPSDHDLKPYPSPRDO 
LKNIVSYPPSKPPTYEEQDLVWEFRYYLTNODKALTKILTSVIW 
DLPOGAKOALALLGKWKPKDVEDSLELLSSHYTNPTVRRYAVAR 
LRQADDEDLLM YLLQLVQALKYEN FDDI XNGLEPTK KDSQSS VS 
ENVSNSGINSAEIDSSOIIT/SAPFPSVSSPPP\ASXTKEVPDG 
ENLEODLCTFXISRASKNSTlJuNYLWYVIVECEDODTtJORDPK 
THEWLmmRRFSQALLKGDKSVR^SLlAAOC/TF^RLVHLM 
KAVQR ESGNR KXKNERLQALLGDN£ KKNLSDVEL J PL PLEPQVK 
a RGI 1 PETATLFKSALMPAOLFFKTEDGGKYPVI FKHGDDLRQD 
OL1 L0 3 1 S LMDKLLR KENLDLKLT P YKVLATSTKHGFMQF I QSV 
P VAEVLDTEGS 1 ON FFR KYA PS ENG PNG 1 S AEVWDTYVKSCAG Y 
CV ITY I LG VGDRHLDN L LL>T KTGK.LF H I Dr GYIIiGRDPKPLPPP 
MKLNKEMVEGMGGTOSKOyQEFRKOCYTAKLHLRRYSNXILNL? 
SLMVDANI PDIALEPDK'rV/KKVQDKFRLDLSDEEAVHYMOSLID 
ES VHALFAAVVEQI HK FAQY WR K 


6149 


1 


141? 


RVDPRVRENGTAKP 1 K^GKTSPASKDORTGKKTSVQGQVQKGND 
pQV^nFPcnppQPK^QFr FFnDDF^ui oriPr»rsnTr*jrir>rYrPDPiJT 

DojuDLT tol/r r Or IVootL CiL\JLJ UC^Cj V L>VvTt.Uv>ur StlJUl/X CttrK.nL* 

GHRPLLMDSEDEEEEEKHSSDSDYEOAKAKYSDKSSVYRDRSGS 
GPTQDLNT J LLTSAOLSS D VA VETPKOE EDV FGAVPFFAVRAQO 
PQQE KNEKN LPQHR F P AAGL EQ E E? DV FT KAP FS KKVNVQECHA 
VG P E AHT I PG Y PKS VDV FGS TP FQ? FLTS TS XS E SNE DLFGLV P 
FDEITGS0OQKVK0RSl,OKLSSRORRTKODMSKSNGKRHHGTPT 
STKKTLKPTYRTPERAFRHKKVGRRDSOSSNEFTjTlSDSKENIS 
VA1>TDGKDRGNVLQPE£S1jLDPFGAKPFHSPD\LSWHPP\HCXJ1j 
o \ U1 K/Wnrt - \ Vutr V»K \ rKVNo Jjn^or n2>J\VV LiArnL)LJr\?f\\/k'/ t 
LTELWOS1TPHOS0050PV\ELDPFGAAPFPSKQ 


6150 


372 


37 


msnikky1 idydwkas1 ei eidhdvmteeklhq1nnfwsdseyr 
i ,n khgsvlnavl1mlac>^ll1 a3 ssdlnaygvvcefdwndgkg 
oegwppmdgsegiritd: dtsgif 




1555 


521 


DSNOOSVSGTAASTLLKS FKATI Y YQGTGHVQQFYGVTSPYSQT 
TP P I VQS Y AC/PS LQ Y 1 QG QQ I FTAH PCG VWQ PAAAVT1 * I V A PG 
QPQPI^PSEMWTNNliLDbPPPSPPKPKTlVbPPNWKTARDPEG 
KIYYYHVITRQTQWDPPTWESPGDDASLEKEAEMDLGTPTYDEN 
PMK\ASKKPKTAEADTSfElAKKSKEVFRKEMSQFIVOCLNPYR 
K PDCKVG \ R 1 TTTEDFKH LAR XLTHGVMNKEL KY CKN PE\ DL EC 
NENVKHKTKEYIKKYMOKFGAVYKPKEDTEFRVTVGPGWEDGWS 
G KTDS R ERKS CGPFCSTP VSTVLIjK I HHPGE FNPADVN 


6152 


1366 


648 


NRTWSTPSTWMGVALPF1CSTGPWPVTR01TARTTCGAVPAKCP 
PWC/DVHEPRCOPPDCHGHGTCVDGHCQCTGHFWRGPGCDELDC 
G F SNCSOHG LCTETGCRCDAG WTGSNCS EECPLG WHGPG COR PC 
KCEHHCPCDPKTGNCSVSRVKOCLOPPEATLRAGELSFFTRTAW 
I^LTLALAFLLL1STAAK-LSLLI>SRAERNRRLHGDYAYHPL0EM 
KGE PLAAEKEQPGG AHN P F KD 


6153 


2 


3368 


GRVGARSPGRAYALLLLLlCFNVGSGLHLOVLSTRNENKiLPKH 
PHLVRQKRAW 1 TAP VALL EGEDLSKKNP I AK I HS DLAEERGLK I 
TY K YTGKG I TE PPFG I FV KNKJDTGELNVTS3 LDREETP FFLLTG 
YAbDARGNNVEKPLELRlKVLniNDNEPVFTQDVFVGSVEELSA 
AHTLVMKINATDADEPhT7LNS KI SYR I VSLEPAYPPVFYLNKDT 
G E I YTTS VTLDREEH S S Y TLT VSAR EX5N GEVTDK P VKQAQVQ I R 
I LDVNDNI PWENKVLEGMVEENOVNVEVTRIKVFDADEIGSPN 
WLANFTFASGNEGGYFH1 ETDAOTNEGI VTL1 KEVDYEEMKNLD 
FSVIVANKAAPHKSI RSKYKPTPIP1 KVKVKNVKEGIHFKSSVI 
SIYVSESMDRSSKGQ1IGNFQAFDEDTGLPAHARYVKLEDRDNW 
I SVDSVTSEI KLAKLPDFFSRYV0NGTYTVK1 VA1 SEDYPRKT1 
TGTVL1NV ED I NDNC PTLI E P VQT I CHD AEYVNVTAEDLDGR PN 
SGPFSFSVIDKPPX3MAEKWKIAR0ESTSVLL00SEKKLGRSEIQ 
FL I SDNQG FS CPEKQ VLTLTV CE VLHGS \ GCR EAQHDS YVGLGP 
AAIALMILAPLLLLLVPLLLLMCHCGKGAKGFTP3PGTIEMLHP 
WNN EGAP PEDKVVPS PLP^TJQGGSLVGRNGVGGMAKEATMKGSS 
SASIVKG0«EMSEMIX5RKEEHRSLLSGRATQFTGATGAI\MTTE 
TTITARATGASRDVAGAOAAAVALNEEFLKNYFTDKAASYTEED 
ENHTAKDCLLVYSOEETESLNAS IGCCSPI EGEU)DRFU)DLGL 
KFKTLAE VCLGQKI D 1 N KEI EQRQKP ATETSKNTASHSLCEQTM 
VN S ENTYS S GS S FP V PK S LQ EAJJAEKVTQE I VTERS VS S RQAQK 
VATPLPDPMASRNV1ATETSYVTGSTMPPTTVILGPSQPQSLIV 
TER V YAPASTLVDQP YAK EGTVWTERV J QPHGGGSNPLEGTQH 
LODVpyVMVRER•SFL^PSSGVQPTLAMPNIAVGC)^A^TVTERVb 1 
APA S TLCS S YQ I P TE N £ MTARNTTVSG AG VPG P L P D FG LE ESGK 
SNSTITTSSTRVTKHSTVQHSYS 


6154


3660 


214C 


KKKTKMKNTIX^KTVNFGAVJPKPTISDKSHLLQWVSKLDLTDAKN 
SDTAHIKS2E1TSILNGL0ASESSAEDSEQBDERGA0DMDNNGK 
FJESX1 DHLTKNRNDL 3 S KXEQNSSSLLEENKVHADLVl S KPVSK 
SPERLRKDlEVLSEDTDYEEDEVTKKRKDVKKDTrDKSSKPQlK 
RGKRRYCNTEECLKTCr FGKKEEKAXNKESLCWENSSNSSSDED 
EE ETXAKMTPTKKYNG L EEXRX5LRTTGFYSGFS EVAEKR I KLL 
NNS DER I ON S RAKDR K PWS S I QGQW PKKTLKELFSDS DTE AAA 
SFPHPA PEEG VAEESLC'TVAEEESCS PSVELEKPPP VNVDS KPI 
EEKTVE VNDRKAEFPSS GSNFSA* 1 PLPYLHLNRLHQSL * OKGS 
RQQSS VVJS EPLAPNQE EVRS 1 KSETDSTIEVDS VAGELQDLQS 
ERE*LASRF*CQCELEO* *SARTRTS*KSLYRSEKS£RCSGRRK 
FI KKAEKK P * SNSGKQOKEGK 


6155 


869 


12J 


HLLPEljRGKSWITMKYVFyLGVLAGTFFFADSSVQKEDPAPYLV ' 
YLKSHKNPCVGVLI KPS V7VLAPAHCYLPNLKVMLGNFKSRVRDG 
TEQTIN PIC 1 VRYWNY5" IJSAPQDDLMLIKLAKPAMLNPKVQALN 
P\PT7NVRPGTVCLL?C-LDWSQEKSGRHPDLRQNLEAPVMSDRE 
CQKTEQGKSKKNSLCVK FVKVFSR 3 FGEVAVATVI CKDKLQG I E 
VGHFMGGDVCI YTNVYK YVSWIENTAKDK 


6156 


SV25 


3984 


GTSTVTMATKKHFS1 1 1.NLLGMLLKKDNQDTRKLLMTWALEVAV 
VKKKSETyAFLFCLPSKHKFCKGLLADTbVEDVNICLOACSSI>H 
ALSSSLPDDLLQRCVDVCRV0LVHRGTC1RQAFGK1.LKS1PLGV 
FLSNNNHTE 3 OEI SLAl.RS HMSKAPSNTFHPQDFSD/ VI S FI LY 
GNSHRTGKDNWLERbFYf^CO^LDKRDQSTIPRNLLKTDAVLWOW 
AIWEAAQFTVLSKLRTr j-.GRAQDTFOTIEGIIRSLAGHTL.NPDC 
DVSOWTTADKBEGHGNNO1.RLVLLL0VLENLEKLMYNAYEGCAN 
ALTSPPKVIRTFLYTNROTCODWLTRIRLSIMRVGLLAGOPAVI 
VRMGFDLLTFMXTTSI^QGNELE^SIMMVVEALCELHCPEAIOG
I AVWSS S I VG KHLLW IN* VA00AEGRFEKASVEY0EH1 CAMTGV
DCCISSFDKSVLTLASAGCKSASLKHCLNGESRKSVLSKPTDSS
PEV3NY IXNKACECY3STADWAAV0EW0NAIHDLKKSTSSTSLN
LKADFNYIKSLSSFESGKFVECTEOLELLPGENINLLAGGSKEK
IDMKKLLRNK


6157 




3 25- 


MANRGPSVGLSREVOEK'j EQXYDADLENKLVDWI ILQCAEDIEH 
PPPGRAHF0KWLMDGT\'bCK^INSLYPPG0EP3 PKI SESXMAFK 
0ME0 ISOFLKAAETYGVn TTD I FQT\T)LWEGKDMAAVORTLMAL 
GSVAVTXDDGCYRGEPSK FHRKAQQNRRGFSEEQLRQGQNV I GL 
QMGSNKGASOAGMTG YGM PRC; I M* DAASCP 


6158 


441 

1 


1482


LGSL3VLSLHCKVIFSSOSLERAMKEKAVDLVPILAONPGLA0N 
P1LEGKDHN0NTGVDP1 j DHVQDRKTD/SRSKSPHKKRSKSRER 
RKSRSRSHSRDKRKDTKEK2KEKERVKEKDREKEREREKEREKE 
VPBnKNKnRnKPRFKnRF KDXEKDREREREKEHEKD^DKEKEKE 
QDKEKEREKDRSKE1DEKRKKDKKSRTPPRSYNASRHSRSSSRE 
RRkRRSRSSSRSPRTSKTJKRKSSRSPSPRSRNKKDKKREKERD 
H1SERRERERSTSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
SS VSKEVDDKDAPRTEEK K3 OHNGtJCQLNEENLSTKTEAV 


6159 


53 

i 


84 

i 


AVIAPLHISI^DRARPYLKNTEKSSTTCSRRRNOSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPU'llWPVTLPVSDPVGSCV 
1 1 TGTP 1 liTFVKDPOLE VNF YTGMDEDSD I AFQFRLH FGKPAI M 
NSCVFGI WRYEEKCY YLF FEDGKPFELCIYVRHKE YKVMVNGQR 
I YWAHR FPPAS VKMLOV FRD I SLTRVLI SD*GRCVR I TAVOEF 
DVSVSCDCTTAYQPG 


6160 


1626 


1790 


AGAKFFP* F* KVADAQPT ESEKEIYKQVNWLKDAEG I LEDLQ5 
YRGAGHEIREA3QHPADFKL0EKAWGA\n/PLVGKLKKFYEFS0R 
LEAAI.RGLLG ALTS TPYS PTQ1 i LEREQALAKQFAE IIjHF TLR FD 
ELKMTNPAI QNDFS YYRRTLS RMR 1 NNVPAEGENEVNNE LANRM 
SLFY AEATPT4LKTLSDATTKFVSENXNLP I ENTmCLSTMASVC 
RVMbETPEYRSRFTWEETVSKCLKVT4VGVIILyDP^HPVGAFAK 
TSK I DMKGCI KVLKDQP PNSVEGLLNALRYTTKHLNDETTSKQI 
KSMLQ*QLLTLVNKG 


6161 


455 


1S69 


PVSGSESSLRRAWAS I LRLMLGPRVAVSILCEDGISH* LLEKH* 
KSUVLEPLS SLALEEQCLALS LDWSTGKTGRAGDQPLK 1 1 SSDS 
TGOLHLIjMVNETRFRLOKVASWOAHOFEAWIAAFNYWHPEIVYS 
GGDDGLLRG W DTR V PGKFLFTS XRHTMGVCS 1 QSSPHREHI LAT 
GSYDEH ILLWDTRNM KQPLADTPVQGGVWR1 KWHPFHHH LLLAA 
CMHSGFKH/NC0KAMEERQEATVLT5HTLPDSLVYGADWSWLLF 
RSLQRAPSWSFPSNLGTXTADLKGASELPTPCHECREDNDGEGH 
ARPQS GMK P LTE GMR KNG 1*WL0 AT AATTRDCG VN PEEAD S AFS L 
LATCSFYDHALHLWEWEGN 


6162


1 


586 


RTIHATGRAGASPMHRL1 VWRLAEANKQHVRCOKCLEKGHWTYE 
CTG KRKYLHR PS R TAEL K KALK E KJENRLLLOQS I GETNVER XAX 
KKRSKSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*RM*TKEE 
EKE I ELLHS YWTDGLXTLM 


6163 


1081 


785 


H I RSTTEGC AVRLH PTQNTG KAN J M I LLSVS LGRHWAFT Y K FFL
TPWFVFFFFFF»lRKE*VM0KNPMKSREDEKMEKLNNLHVORAD
MNRLI MNYLVTEGFKEAAEKFRMESG I EPSVDLETLDER 1 K 1 RE 
MILKGQIQEAJALINSLHPELLDTNRYLYFHLWQHLIELIROR 
ETT EAALE FAOT0LAEQG E E S R E C LT EM ERTLALLAFDS P E E S P F 
GDLLHTMQRQK VWSE VNOAVLDY EN RESTPKLAKLLKLLLW AON 
ELDOKKVKYPKMTDLSKGVIEEPK 


6164 


I 


406 


PCOSPGRSRKRODKbTGSLRRGGRCLKRG^GVGTILSNVLKKR 
SCI SRTAPRX LCTLEPG VDTKLKFTLE PSLGQNGFQQWY DALKA 
VARLSTGIPKEWRRKVWLTLADHYLHSIAIDWTKTMRFTFNERS 
NPDDDSMG I OI VKDLHRTGCS S YCGOE AEQDRVVLKRVLLA Y AR 
WNKTVGYCOGFNI LAAL1 LEVMEGNEGDALKIMl YLIDKVLPES 
Y FVNNLRALS VDWAV FRDLLRMKLP ELSQHLDTLQRTANKESGG 
G YE P P LTNVFTMQ W F LTL FATCLPNQT VLKI WDS V FFEGS E II L 
R VS LA I WAKLGE0 1 E CCETADE FYS TMGRLTQEMLENDLLQSH E 
LM0TVYSHAPFPFPOLAELREKYTYNITPFPATVKPTSV5GRHS 
KARDSDEENDPDDEDA WNAVGCLG P FSGFLAPELQKYQKQ I KE 
PNEEQSI>RSNNIAELSPGAIWSCRSEYHAAFNSMMMERKTTD1N 
AiKRQYSRI KKK0QO0VH0VY I RADKGPVTS I LPSQVNSS P VI N 
HLLLGKKMKMTNRAAKNAVIH I PGHTGGKIS PVPYEDLKTKLNS 
PWRTH I RVHKKNMPRTK SHPGCGDTVGLI DEONEASKTNGLGAA 
EAFPSGCTATAGREGSSPEGSTRRTIEGOSPEPVFGDADVDVSA 
VQAKLGALELNOFDAAAETELRVHPPCQRHCPEPPSAPEENKAT 

HFPOMSRSFSKPGGGNSGP*KMVFSSGTMliSROLPGYPQEYORN 
GGERFG 


6165 


90 


405 


PCCSPGRSRMRODKLTGSLRRGGRCLKROGGGVGTILSNVUOUI 
S CI S RTAPR LLCTLE PG VDTKL K FTLEPSLGQNGFQQW Y D ALKA 
VARLSTG I PKEWRRKVWLTLADHYLHS 3 AIDWDKTMRFTFNERS 
N PDDDSMGI 0 1 VKDLHRTGC5 S YCGQEAEQCRVVLKRVLLAYAR 
WNKTVG YCQG FN I LAAL I LEVMEGNEGDAI>K I MI YLI DK VLPES 
YFVr^RALSVDMAVFRI^LI^MKiPELSQHIJ^TLCRTANKESGG 
GYEPPL-mFTMOWFLTLFATCLPNQTVLKIWDSVFFEGSEIIL 
RVSLAIWAKLGE01 ECCETADE FYSTMGRLTQEMLENDLLOSHE 
UiOTVYSHAPFPFPOLAELREKYTYNITPFPATVKFTSVSGRHS 
KARDSDEENDPDDEDA WNAVGCLG PFSGFLAPELQKYQKOI KE 

PNEEQSLRSNN1 J\EL$ P G A I NS CRS E YHAAFN S MMM KRMTTD I N 
ALKRQY SR1KK KOOQOVHQVY I RADKGPVTS 3 LPSQVNSS P V I N 1 
HIAJjGKKMKMTJJRAAKN AV I H I PGHTGGK 1 S P V P Y EDLKTKl^NS 
PWRTHIRVHKK3WPRTXSHPGCGDTVGLIDEQNEASXTNGU3AA 
EAFPSGCTATAGREGSS PEGSTRRT1 EGOSPEPVFGDADVDVSA 
VOAKLrGALELNOR DAAAETELR VH P PCQRHCPE P PS APEENKAT 
SKAPQGSNSKTPI FS PFPSVKPLRKSATARNLGLYGPTERTPTV 
HFPQMSRS FSKPGGGNSGP* KM VFSSGTMLSROLPGYPOEVqk^ 

GGERFG 


6166 

6TTT 




1206 


kklkrtvamagaewksleeclekhlplpdlqevkrvlygkelrk 
ldlpreafeaaskedfelqgyafeaaeeolrrprivhvglvonr 
iplpakapvaepvsalhrri kaivevaamcgvn1 icfqeawtmp 
fafctreklpw7e f aesaedgpttr fcqklaknhdmwvs pi le 
rdsehgdvlwntawisnsgavlgktrknhiprvgdfnestyym 
egi^lghpvfotofgri avnicygrhxiplnwlmys ingaei i fnp 

SATIGALSESI/WPl EARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDC K KAHQtFG Y FYGSS YVAAPDSSRTPGLSR SR DGLLVAXLDL 
NLCQQVNDVWN FKMTGRY EMYARELAEAVKSN YS PTI VKE * PAS 
VPALG 


6167


1220


1844 


YGlVTGPSbCAGPKQFKKOEKNPVLVSPEFVDEALCACEEYLSN 
LAFIMDIDKDLEAPLYLTPEGWSLFLQRYYOWHEGAELRHLDTO 
VQRCED1 LOOLQAWPOl DMEGDRNI WIVKPGAK5RGRG1MCMD 
HLEEMLKLVNGNPWMKDGKWWQKYIERPLLIFGTKFDLROWF 
LVTDWNPLTVWFYRDSYI RFSTQPF5?LKNLDK*AP1,YLTPEGWS 
LFLORYY0WHEGAELRHLDTQV0RCEDIL00L0AWP01DMEG 
DRN 1 W I V KPG A K SK GRG 2 MCMDH LEE MLKLVNGN P WMK DG K WV 
VQKY1 ERFLL3 FGTKFDLRQWFIWTD WNPLTVWF YRDSY IRFST 
QPFSLKNLDK 


6168 


84 


1392


VWFVPSVSAMPFKKQA0AGGSKKAEOKKKEK3 I EDKTFGLKNKK 
GAKO0KFIKAVTHUVKFGOQNPRQVAOSEAEKKLKKDDKKKELQ 
ELNELFKPWAAQK J SKGADPKSWCAFFKQGQCTKGDKCKFSH 
DLTLERKCEKRSVY I DARDEELEKDTMDNWDEKICLE2WNKKHG 
EAEKKKPKTQ1VCKHFLEAIENNKYGWFWVCPGGGDICMYRHAL 
PPGFVLKKKKKKKKKEDE1SL*D1»1 ERERSALGPNVTKITLESF 
LAWKJCRKRQEKI DKLEODMERRKADFKAGKAiVI SGREVFEFRP 
ELVNDDDEEADDTR YrQGTGGDEVDDS VSVND 1 DLS LYI PR DVD 
ETG 1 TVASLERFST YTSDKDENK1*S EASGGRAENGERSDLEEDN 
EREGTENGA I DAV PVDENLFTGEDLDEI ,EE£ LNT1»DLEE 


6169 


112 


662 


APAAAWAESPEDLK LPNAV I TRI I K jEALPDG VN 1 S KE ARS A3 SR 
AASVFVLYATSCANNFAKKGW^KTLKASDVLSANEEMEF^RFVT 
PLKEAXEAYRREQKGKKEASBQKKKDKDKKTD5EE0DXSRDEDN 
DEDE ERLEEEEQN E E E E VDN ♦ KGRE7VA P WKV PLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVF1TGASRG! GKAI ALKAAKDGAN 3 VIA 
AKTAO PHP KLLGTI YTAAEE I EAVGGFJVLPCI VDVRDEQQI SAA 
VEKA3 KKFGG 1 D J LVMNASAI SLTNTLDTPTKRLDLMMNVNTRG 
TYLASKACIPYLKKSKVAH1PNISPFLNLNPVWFK0HCGRM*W 
G * GDGLCL1 CFELNLCMSDV ITICT 


6171 


382 


941 


HFMOSDVELDCDIEPCGHTXFPPTLPLSTTVIVCSCHPVATAST 
MAEAFS KTTSEE DOS IOEPKEANSMTAQK0KK* GLRGSR RRHAN 
SGGD1 rcDSFAAYFPRVLKQVHQALSLSOEAVSVMDSrmDI LD 
RIATEAGHLAHYSKCVTITSRDIRMAVCLLLPGKMGKLAESQGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHR FSRTHVEAALKMLRRF. AR1>RRE YLYR KAREEAQR 
SAQERKERLRRALEENRLIPTBLRREALAXOGSLEFDDAGGEGV 
TSHV^DEYTiWAGVEDPKVMITTSRDFSSRLKMFAKELKLVFFCA 








QRMNRGKHEVGALVRACKANGVTDLLWKEHRGTPVGLIVSHLP 
FGPTAYrThCWVMRHDIPDLGTMSEAKPHLITHGFSSRLGKRV 
SD I LR YL F P VPKDDS HRV I TFANQDDY I S FRHHVY KKTDKRNVE 
LTEVGPRFELK1 YMlRLGTliEQEATAD\ r EWRWHPYTNTA.RKR.VF 
LSTE*AAPRPLGQL>L 


6173 


1


286 



SVDHREV0VLSQ5.MPLTPH0AVLRGERPYMCVECGKCFGRSSHL 
LQHQR1 HTGEKPYVCSVCGKAFSOSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDI^AOHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
R3HTGERFYVCPLCGKAFN)1STVLRSHQRVHTGEKPHRCNECGK 
TFSVKRTLLQH^RIHTGEKPYTCSECGKAFSDRSVLIOKHNVHT 
GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHLI0HOKVHRKL*PTCVLSVGSALAGVPTS?SISVSTI J ERSP 
MCAVYVGRPSARAQSLVNTt5QFTQVRSPMSVMSVEKPLE 


6174


1QSC 


959 


PRPPGKRWWVAGLGNPGLPGTRHSVGMAVLGQLARRLGVAESWT 
RDRHCAAnLALAPIjGDAOLVLLRPRRLKNANGRSVARAAELPGL 
TAEEVYrVMDELDKPLGRIiALKLGGSARGHNGVRSClSCLNSNA 
KPRLRVG3GRPAHPEAVQAHVLGCFSPAE0ELLPLLLDRATDLI 
LDHIRERSOGPSI,GP*H + WFSKKA 


6175 


2204 


334 


RYFRAI)Pr<55RSGOPRAEGL<?AFAEGPLRAKAAPVKGNRKOSTEG 
DALDPPASPKPAGKQNGIQNPISLEDSPEAGGEREEECEREEEQ 

jutt.vqt vvrMifPDUTDT p*D\/DUT/t wot wi wv t v v nvPYi r.avr 
f\r Jj vol, i t\ r flNf.Kn x tr 1 £/K Vr'.HLAjr .'svir'ljr* m z w^vtAbbHl r. 

LVTGRRLKKNVY.NELGGSPGSTSGATCTRRHY • RL-VLPYVRHLK 

GEDDKPLFTSKPRKOYKMAKENRGDDGATERPKKAKEERRMDQM 

MPGKrXAEAADPAPLPSQEPPRNSTEOOGLASGSSVSFVGASGC 

PEAYKRLLSSFYCKGTHGJMSPLAK1CKLLA0VSKVEAL0C0EEG 

GRHGAFPrASPZvVHT,PF^POQPK01.TFN^nHRl.TPOFfi]»OAPfifi 

SLREEAQAGFCPAAP I FKGCFYTHPTEVLKPVSQHPRDFr SRLK 

DGVLLGFPGKEGLSVKEPQLVWGGDANRPSAFHKGGSRKGILYP 

KPKACWVSPMAKVPAESPTIiPPTFPSSPGIX5SKRSLEEEGAAHS 

GKRLRAVEPFLKEADAKKCGAKPAGSGLVSCLLGPALGPVPPEA 

YRGTM LH C P I.N FTGT PG PLKGOAALP FS PLV I PAFPAH FLATAG 

PSPMAAGLMHFPPTSFDSALSHRLCPASSAWHAPPVTTYAAPHF 

FHLNTKL 


6176 


1040 


402 


PLSALRAMAEVHVIG01IGASGFSESSLFCKWGIHTGAAWKLLS 
GVREG0T0VDTPQ1GDMAYWSHPIDLHFATKGL0GWPRLHF0VW 
S0DSFGRC0LAGYGFCHVPSSPGTH0LACPTKRPLGSWRE01AR 
AFVGGGPC'LLHGDTI YSGADRYRLHTAAGGTVHLEIGLLLRNFD 
RYGVEC*GTLPPTSPPSTPRTPSDGGGWHSGOEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAPYCCEKGKROPHKSLHDRCFGFJU.DPN 
CSHCYLDCIKRSDFLGFSGYSPHFVAJSTNSEHKMOPSSMQ0AL 
PSQ*PYWTDPRPALVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGGIKGVARAASLVGRRRAGTGMALLLCLVCLTAALAHGCl. 
HCHSKFSKKFSFYRHHVNFKSWWVGDIPVSGALLTDWSDDTMKE 
LHLAJPAKJTREKLD0VATAVY0MMDQLYOGKMYFPGYFPNELR 
NIFRE0VHLI0NAI I ESR 1 DCQHRCG I FQ YETI S CNNCTDSHVA 
CFGYNCESSAOWKSAVQGLLNYINNWHKQDTSMRPRSSAFSWPG 
THPJVAPAFLVLPALRCLEPPHLANIjSLEDAA*CLKQH 


6179 


806 


276 


rgetremagnllsgagrrlwdwvplacrsfslgvprligirltl 
pppkwdrv7nekra/4fgvydni gl lgnfexhpkelirgpj wlrg 
wkgnelorcirkrkmvgsrmfaddlhnlnkrirylykhfnrhgk 
fr * icr k lr tsekahls pwrretvlfp vrkr lci fs v 1 kwgffg i 


6180 


156


1833 


DHHILKAASTTHVCARGNIFAI PNTRCXEC*ATATPSSLECQN* 
S H LS LC P L P ATTSGLTPNS M 1 P EKERQN I AER LLR VM CADLG AL 
S WSGKE FLKLAOTLVDSGAR YGAFS VTE ILGNFNTIALKHLPR 
MYNOVKVJCVTCALGSNACXG1 GVTCHSOSVGPDSCYI LTAYQAE 








GNH1 KSYVLGVKGADIRDSGDLVHHWVOK'VLSEFVMSEIRTVY V " 

TDCRVSTSAFSKAGMCLRCSACALNSWOSVLSKRTLQARSMHE 

VIELLKVCEDLAGSTGLAKETFGSLEETSPPPCWNSVTDSLLLV 

KERYEQI CEFYSRAKKMNLIQSLNKHLLSNLAAILTPVK0AV2 E 

ISNiSOPTLOLVLPTYVRLEKLFTAKANDAGTVSKLCHLFLEAL 

KENFKVHPAHKVAMlLDPQQKLRPVPPYCKEErlGKVCELINEV 

KESWAEEADFEPAAKKPRSAAVENPAAQEDDRLGKNEVYDYLQE 

PLF0ATPDLFQYWSCVTOKHTKLAKLAFWL1AVPAVGARSGCVN 

MCE0AIL1KRRRLLSPEDMKKLMFLKSNHL 


6181


169 


1032 


TRTLhSPVLLPGPRWKPWRRRPMGPLALPAWLOPRYRKNAYLFI 
YYLIOFCGHSWIFTNMTVRFFSFGKDSMVDTFyAlGLVMRLCQS 
VSLLELLHI YVGIESNHLLPRFLQLTER1 1 ILFW1TSQEEVQE 
KY WCVLFVFWNLbDMVRYTYSMLSVI G J S YAVLTWLSQTLWMP 

T YPIiPl/r tPAP/VTVO^ t.PYFF<^Pf^TV^TKl.P"7nT.Q7 VVDVUI VH 
1 lr U\- v K-t-ALsr\r n -I ]yo Utr it Loru 1 J j 1 r-br r UUO i 1 r i J V L>I\ A 

YLMMLF1 GMYFTYSKLYSERRDI LG3FP3 KKKKM*STAFQCCTR 
KDRLW1CCSK*NTGS J LVEKFLVF 


6182 


1765 


1224 


AS* 1 DYQLNTLLKEFQLTEENTKLRYLTCSLI EDMAAAYFPDCI 
VRPFGSSVNTFGKLGCDLDMFLDLDETRNLSAHKISGNFLt^EFQ 
VKNVPSERIATQKILSVLGECLDIIFGPGCVGVQKILNARCPLVR 

CWARA^S LTS S I PGAW 1 TNFSLTMMV1 F F LORRSPP I LPTLDSL 
KTLADAEUKCVIEGNNCTFVRnLSRIKPSQNTETLELLLKEFFE 
YPX3NFAFDKNSINIRQGREQNKPDSSPLY 1QNPFETSLNISKNV 
SOSObQK FVDLARES AW I L^QQEDTDR PS 1 £ SNRPWGLVS LLLPS 
APNRKSFTKKKSNKFA1ETVKKLLESLKGKR7ENFTKTSGKRTI 
STOT 


6183


1118 


452 


HLDRY1KSPGSGSSTPAPPSHLLLYLLHP0STRTMGCCGCSRGC 
GSGCGGCGSSCGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 
RCYVPVCCCKPVCSWVPACSCTSCGSCGGSKGGCGSCGGSKGGC 
GSCGCSOSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC 

CCSS CCK PCCCQS NCCVP V CCQCX J ♦ GSG F R PSGFS CL VXAFLM 
VP 


6184 


1 


2191 


I VT VR E EDGAPAVAP PGWVS RAN K R SGAG PGGSGGGG ARG AEE 
EPPPPLQAVLVADSFDRRFFPISKDOPRVILPLANVALIDYTLE 
FLTATG VQETFVFCCWKAAC- 1 KEHLLKSKWCR PTS LNWR I ITS 
ELY RS LGDVLRDVDAKALVRS DFLLVYGD V I SNIN I TRALEEHR 
LRRKL* KNVSVMTMI FKESSPSHPTRCHEDNVWAVDSTTNRVL 
H FO K TOG liRR FA F PbS LFOGS S DG V E VR Y DLLDCH I S I CS P Q VA 
0LFTDNFDYQTRDDFVRGL1.VNEEI LGNQI HMHVTAKEYGAR VS 
NLHMYSAVCADV3 RRWVYPLTPEANFTDSl-TOSCTHSRHNI YRG 
PEVSLGHGS1LEENVLLGSGTVIGSNCFITNSV1GPGCH1EPGD 
NWLDQTYLWGGVR VAAGAQI HOSLLCDNAEVKER VTLKPR S Vl> 
TS0VWGPN1TLPEGSVJSLHPPDAEEDEDDGEFSDDSGADOEK 
DKVKM KG YNPAEVGAAGKG YLWKAAGMNME EEE ELQQNLWGLKI 
NMEEESESESEQSMDSEEPDSRGGSPQMDDIKVFQN2VLGTLQR 
G KEEN 1 S CDNLVLE I NS LK YA YU I S LKE VKQVXSHWLEFP LQQ 
MDSPLDSSRYCAtLLPLLKAWSPVFRJCyiKRAADHLEALAAlED 
FFLEHEAbGl SMAKVLMAFYQLEl LAEET1 LSMFSOROTTDKGQ 
QLRKNQQLQRFIQWLKEAEEESSEDD 


6185


791 


44 


PCTS CVLWATLHLPA5 rRKAPQAECGM I S I TEWQK 1 GVG J TG FG 
I FFI LFG TLLYFDS VLLAFGNLLFLTGLS L 1 1 GLR KTFWFFFQR 
HKLKGTSFLLGGWIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 
GFLGNVCNI PFLGALFRR LQGTSSMV* KTEtfSSLN LDHWLKGAK 
REEWEPPPQS PALTHS PTYPGPPOVOKERNGAEQLTSNPQVDSR 
GCQEAEM0TPRRLGWGWYHTL7LYLWEEK 


6186 


B69 


238 


V YG I DS SNTNTHGAE ERNRKJLK KH WK LCHAQS RLDVNGLALKMA 








KER KVKN KV KMKADTEEV FNNS PTN0EKMP7SAI DrSGSV I S 
NJ.RNOMETLHSQPH0EENLCFENSFSL1NL1.PINAVEPTSSO0I 
PNHETSEANKERRICMTSKSSESNI YSPLTSF I TADS ELHDJ I KD 
LEDCLMVGLHTCGDLftPNTLRI FTSNSE1 KGVCSVGCCYHLLSE 
EFENOHKKKTOEKWGFPMCHYLKEERWCCGRN'ARMSACLALERV 
AAG0GLPTES LFYRAVLQD1 1 KDCYGI TKCDHHVG K I YSKCSSF 
LDTVKRSLKKLGLDESKLPEKl IH^YEKYKPRMKELFAFKMLX 
VVLAPCIETLILLDRLCYTjKEQEDIAWSALVKLFDPVXSPRCYA 

v:alkkoo*fplkqiirci£l*dsagcaeevsvgdggpalrdap 
psgsrvgsryd 


6187 


1703 


771 


PA WG PETR LARI LN PDS FI E PRPGRLPELF ATRPHMt: PKASCPA 
AAPLMER K FHVLVG VTGS VAALKLPLLVSK LLDI PGLEVAWT7 
ERAKHFY S PQDIPVTLYSDADEWEMWKSRSDPV^tt 1 DLRRWADL 
LLVAPLDA2JTLGKVASG1 CDKLLTCVMRAV/DRS KPLLFCPAMNT 
AMWFHPI TAOOVnOLiKAFKYVEI PCVAKKLVCGDEGLGAMAEVG 
Tl VDKVKEVLFQHSGFOQS* PGISVMGVPLYSEWVQAKSVKNDV 
GK : GGYPH LLNGGPALS LPRGQACSRLNWTEGPGLS FFQPGEAA 
A 


6188 


236 


1534 


KGFVNJAGP1MAELOVSPO>*KAPEMS01CLSCGHPSA-»GPRWA£M 
NlGVFICIRCAGlHRNLGVHISRVKSVNl>D0KT0EO30CMQEMG 
NGKAMRLYEAYLPETFRRPQJDPAVEGF1RDKYEKXKYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKbEPWFEKVKMPOKKEDPQLP 
R KS S ?KS TAPVMDLLGLDAPVA CS I ANS KTSNTLEKELDLLAS V 
PSPSSSGS R KWGSMPTAGS AGS VPSNLNLFPE PGSK S EE1GKK 
01»SKDSlLSliYGSQTPQMPlX)AMFMAPAOWAVPTAYFSFPGVTP 
PNS J MG5MMPPPVGMVAQPGASGMVAPMAMPAGYMGGMQASMMG 
VPNGMMTTC^AGYMAGMAAMPQTVYGV0PA0QLQWN1,T0MTCX3M 
AG MM FYGANGMMNYGQS MSGGNEQAANCTLS P0MVJ K 


6189 


1297 


793 


LG EPLGDLCELI PGDVQQLOMG E VH PGTGAOG S AAQS VAGS VQL 
TQLSHAR0RPSC0GS0MALDLOHMDI SR0PK WOHVOPVARQVQ 
RAQQAQLAEGVAVIILW AGDAWAE VELLOEVGGG KVF AAN ACDL 
VVCDHEGAKAARQATGHAbQRVIVOVRRVCPLEAl>RVPSGLPR 
RVRAFM I LHNQI TG I GR EDFATTY FLSELNLF Y MR ITS PQVHRP 
AFRKLRLLRSUDt.SGNRLHMLPPGLPRNVlWLK^KRNElAALAR 
GAI^G MACLRELYLTSNR LRS RALGPRAWVDL/JILOL I.DI AGN0 
LTE 1 PEGLPESLEYLYL0NNK1SAVPANAFDSTPNLKG 1 FIjRFN 
KLAVG S WDSAFKRLKHLQVLDI EGN L»EFGD I S KDRGRLGKEKE 
EEEEDEVEEEETR 


6190 


66 


1309 


ILVGNVSFLLSFAEYVCNCSWGSLNVNRCNCTTGOCECRPGYQ 
GLKCETCKEGF YZ.N YTSGLCQPCDCS PHGALS J PCNSSGKCQCK 
VGV 1GSI CDRCQDG YTG FS KNGCL P CQCNNR £ AS CD ALTG ACLN 
CO/ENS KGNHCEECKEGFYQSPPATKECLRCPCS AVTSTGSCS I K 
SSELEPECDQCKDGYIGPNCNKCENGYYNFDS 1 CRKCCCHGHVY 
PVKTPKlCKPESGECINCLHNTTGFViCENCL*GYVHDLEGNClK 
KV I L P TPEGS TI L VSNAS liTTS VPTP V I NS T FT P TTLCT I FS VS 
TSENSTSALADVSWTOFNI 1 1 LTV 1 1 IVWLLMGFVGAVYMYRE 
YQN R K LNA P PWT I ELKEDN I S FSS YHDS I PNAD V SG LLEDDGN E 
VAPNGOLTLTTPIHNYKA 


6191 


1212 


1S11 


VNLCHGGLLHLSTHHLG1 KPSMH* LFFLMLSFPHLTPQQPKCPS 
M1DWIKKIWY1YTMEYYATIKRNEIMFFAGTWKEMEAIILSKLM 
QDYMFSLISGS 


6192


3 


950 


TRGCGNKMAGKKNVLSSIAVY AEDSEPESDGEAG I EAVGSAAEE 
XGGLVS DA Y G EDDFS RLGGDEDG YEE E EDENS R OS EDDDS ETEK 
PEADDPKDNTEAEKRDPOELVASFSBRVRNMSPDEI KI PPEPPG 
RCSNHL0DKIQKLYERK1 KEGMDMNY 1 1 QRKXE FRU PS1 YEKbl 
0FCAI DELGTKYPKDMFDPHGWSEDSYYEAliAKAQK I EMDKliEK 








AKXERTKI EFVTGTKKGTT'. NATS777TTASTAVALAQKRKSKW 
DSAIPVTTIAOPTJLTTTA: LPAWTVTTSASGSK1 TVISAVGT 
1VKKAKQ 


6193 


3 


250 


TRGCGNKMAGKKNVLSSl^vyAEDfEPESDGEAGlEAVGSAAEE 
KGGLVSDAYGEDDFSRLGG!. V LDGYF ^EEDENSRQSEDDDS2TEK 
P E ADDPKDNTEAEKRDPQE I V AS F S ERVRKMS PDE2K1PPEPPG 

ALbJNrtJLVJJAlyjtLytiKKl A^.«w'!*JWviv Y 1 iyhAAtf AM'ii l£.K-L»I 

0FCA1DELG?NYPKDKFD?:-X- WSEDS YYEALAKAOK 1 EMDKLEK 
AKKERTKI EFVTGTKKGTTTKATSTTTTTASTAVADAUKRKSKW 
DSAI PVTTIAOPTILTTTATLFAWTVTTSASGSX'JTVI SAVGT 
IVKKAKQ 


6194 


3 


95G 


TRGCGNKMAGKKNVLSSU-.VYAEDSEPESDGEAGI EAVGSAAEE 
KGGLVSDAYGEDDFSRl>GGr;EDGYEEEEDEKSRCSEDDDSETEK 
PEADDPKDNTEAEKRDPQE 1..VAS FSER VRW.S PPE 1 Kl PPEPPG 
RCSNlILOE>K10XLYERKIXi:GMDMNYI10RKKEFRNPSIYEKLI 

Ofcaidelgtnypkdmfdphgw£edsyyealaxaok:emdklex 
akkertki efvtgtkkgtttnatstttttastavadaokrkskw 

DSAI PVTTIAQPTI LTTTATl PAWTVTTSASGSXTTVI SAVGT 
IVKKAKQ 


6195 


736 


236 


VANGLQSNMPKFYCDYCPTV^'J'HDSPSVRKTKCSGRKHKENVKD 
YYQKWMEEQAQSLIDKTTA^.?0QGK1 PPTPFSAPPPAGAMI PPP 
PSLPGPPRPGMM?APHMGG>\~MPKMGPPPPGMMPVGPAPCMRP 
PWGHMP^PGPPMMRPPA^^MMVPTRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRNILJDNAEOVlSNLEARNbGPRLTPLIjOEEDSH 
0RLL^X5LMVSELKDHFLRHL0GVEKKKIE0MVLDYISKLLD^ 1 IC 
HIVETNWRKHNLHSWVLHFK 4 . aGSAAEFAVFHIMTR 1 LEATNSL 
FLPLPPGFHTLHTILGVOC1 ,7 LHNLLHCI DSGVLLLTETAVI RL 
MKDLDNTEKNEKLKFSIIVRLPPLIG0K1CRLWDHPMSSNIISR 
MHVTRLLOMYKKOPRNSKJNKSSFSVEFLPLNYFIE 2 LTD1ESS 
NOALY PFEGHDNVDAEFVEE; .ALKHTAMLLGL 


6197 


3 


815 


ADPEGTESAVMSRYTRPPW'iibFlRNVADATRPEDLKREFGRYG 
P I VDVY I PLDFY7R R PRG FA V VQFEDVRDAEDALYN ; .NRKWVCG 
ROIEIOEAOGDRKTPGQMKHVERHPCSPSDHRRSRSPSQRRTRS 
RSSSWGRURRRSDSLKESRKKRFSYSOSKSRSKSLPKRSTSARO 
SRTPRRN?GSRGRSRSKSLQKRSKS3GKSQSSSP0KCTSSGTKS 
RSHGRHSDS I ARSPCKSPKG V TN FETKVQTAKHSHFR SHSRSRS 
YRHKNSW 


6198 


111 


1912 


S EAALSPS FI SPACFLbR KL F ALEDGTLPHPDTLGMN YEGARSE 
RENHAJwDSEGGALDMCCSE V*. . .P0»L»PUP A VMEAJjDc Aea»LajDS%j 
REMPPPPP PSPPSDPAQKP F J-'RGAGSHS LTVRS S LCL FAASQFL 
LACGVLWFSGYGHI WSQNATN LVS S LLTLLKQLEPTA WLDSGTW 
GVPSLHiVFLSGGLVI>VTTLWHLLRTP PEPPTPLP P EDRRQSV 
SRQPSFTYSEMMBEKI EDDFLDLDPVFETPVFDCVMDI KPEADP 
TSLTVKSMGLOERRGSNVSL'J 1 DMCTPGCNEEGFGYLMS PREES 
ARE YLiLS ASRVLQAE E LH EK? . 1/OP FLLQAEFFEI PKN FVDPKEY 
DI PGLVRKNRYKTI LPNPH5> VCLTSPDPDDPLSSY 1 KAMYIRG 
YGGEEKVY 1 ATQGPI VSTVAT'FWRMVWOEHTPI I VK3 TNI EEMN 
EKCTEYMPEEQVAYDGVE ITVO-KV I HTEDYRLRLI SLK SGTEER 
GLKHrWFTSWPDQKTPDRAPrLLHLVREVEEAAOQEC-PHCAPII 
VHCSAGIGRTGCFIA7SI CCC CLRQEGWDILKTTCObRQDRGG 
MIOHCEQYQFVHHVMSLYEKC LSHQSPE 


6199


144 


1211 


MARENGESSSSWKKOAEDI KK 2 F EFKETLGTGAKSEVVLAEEKA 
TGKiFAVKCI PKKALKGKEFF J ENEI AVLRKIKHENI VALEDIY 
ESPNHLYLVMQLVSGGELFDK Z VEKGFYTEKDASTL.3 RQVLDAV 
YYLHRKG I VHRDLKPENLLYY £ QDEES KIMISDFGLE KMEGKGD 
VMS TA CGTPG YVAPE\1iA0K? Y S KA VDCWS I GVIAY1 LLCGYPP 






WO 01/53312



PCT/US00/34263











FYDENESKLFEQILKAEYEFDSPYKPD: SDSAKDFIRNLKfcKDP 
NKRYTCEOAARHPWIAGDTALHKNIJ-ESVSAQIRKNFAKSKWRO 
AFNATAVVRHKRKLH LGSSLDSSN A S VSSSLSLASQXDCASG? F 
HAL* 


6200


'7 0/ 


56 


L P E V PK S LR P R V K PK L CCAQ P A VR V KAK LP KLA V F DLD YT 1, VI P F 
WVDTHVDPPFHXSSDGTVRDRRGQDVkb? PEVPEVLXRLQH LGV 
PGAAAS RTS El EGANQLLELFDLFR YFVHREI YFGSKI THFESL 
QQKTG1 PFSQMI FFDDERRNI VDVSKLGVTCIHIONGMNLOTLS 
OGLETFAKAOTGPLRSSLEES PFKA 


6201 


260S 


2383 


GQTPRVRWKMKRSLRAGKRRQTAGRKSKSPPKVPIVIQDD.9LPA 
GPPPOlRlLKRPTSNGWSSPNSTSKr-TLPVKSLAOREAEVAEA 
RKRILGSASPEEEQEKPILDRPTRISC'PEDSRQPNNVIRQFLGP 
DGSQGFKO-RR 


6202 






INADRAAVASSLLSRPTRKMAPOKDRKPKRSTWRFNLDLTKPVE 
DG I FDSGNFEOFLREKVKVKGKTGNLC-K VVH T F.R FKNKI TW£ E 
KQFSKRYLKYLTKKYLKKNNLRDWLRWASDKETYELRYFQISQ 
DEDESF.SED 


6203


All 1 


2550 


RCPRPFATAGAAASRPDRSPPSGISGSEAAAGAGAAAPASOHPA 
TGTGAVQTE AMKQI LGVI DKKLRNLEKK KGKbDDYQEKMNKG ER 
LNQDQLDAVSKY0EVTNNLEFAKELOREFKALS0DIOKTIKKTA 
RRE0LMREEAE0KRLKTVLELQYVLDKLGDDEVRTDLKOGLNGV 
P IhS E EELS LLDEFY K LVDPE ROMS LRLNKQY ERAS 1 H LIOLLE 

gkekpvcgttykvlkeivervfqsnyft>5:thnhonglcee?:eaa 
sapaved0vpeaefepaeeyte0seves7eyvnrqfmaet0fts 
g ekeqvee wtvetve wns lqqqpc aas fs vpe phslt p vaoad 
plvrrorvodemaonogpynfiqdsmldfenqtldpaivsaopm 
nptqnmpmpqlvcppvhsesrlaopnpv pvqpeatqv plvs sts 

EG YTASQPLYQPSHATEQR PQKE P1DQ1 QAT1 SLNTDQTTAS ££ 
LPAASOPQVFOAGTSKPLHSSGINVNA^PFOSMOTVFKMKAFVP 
PVNEPETLKOONOYOASYNOSFSSOPHOVEOTELQOEO^OTWG 
TYHGSPD0SH0VTGNHQQPPQQNTGFPRSN0PYYN5RGVSRGGS 
RGARGLMNGYRGPANGFRGGYDGYRPSFSNTPNSGYTQSQFSAP 
RDYSGYORDGYQ0KFKRGSGOSGPRGAPRGRGGPPRPNRGMPOM 
NTOQVN 


6204 


293; 


78-7 


CTHNLI SLLGGRALIHFNRFLNLKIQEGEAHN J FCPAYDCFOLV 
PGD1 1 KSWS KEMDKRYLQFDI KAFV ENNPA1 KWCPTPGCDRAV 
RLTKOGSNTSGSDTLSFPJbLRAPAVDCGKGHLFCWECIXSEAHEP 
CDCQTWKNWLOK I TEMKPEELVGVS EAY F. DAANCLWLLTNS X PC 
ANCKS P 1 QKN EG CNHMQC AXCKYDF CW 3 CLEE W KKHS FVK V1 E V I 
Y RCTR YE VI QHVEEQSKEMTVEAEKKK jcr FQELDRFMHYYTR FK 
NHEHSYGLEORLLKTAKEKME0LSRALKETEGGCPDTTFIEDAV 
HVLLKTRRILKCSYPYGFFLEPKSTKKEIFELMQTDLEKVTEDL 
AQKVNRPYLRTPRHXI IKAACbVCXJKROEFLASVARGVAPADS? 

LUSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 
S L»RDYT PAS R S ENQDS LCALS S LDEDDPN I LLA I QLS LQES G LA 
LDEETRDFlrSNEASLGAlGTSLPSRLDSVPRNTDSPRAALfSEE 
LLELGDS 1MRLGAEN DP FSTDTLS SK F LS EARSDFCP S SSfcPDS 
AGQDPNI NDNLLGNI MAWFHDMNPQS IAL I PPATTEI SADSQLP 
CIKDGSEGVKDVELVLPEDSMFEDA5VSEGRGTQ1EENPLEENI 
PGGGKQHPQAW 


6205 


1 


1200 


RAiiRGKMALEVGDMEDGQLSDSDSDN.TVA PSDR PLQLPKVLGG D 
S AMRA FQN 1 TATACAP VS HYRA VES VES S EESFSDS DDDS CLWKR 
KRQKCFNPPPKPEPFOFGQSSQKPPVAGGKKINNlVfGAVLCEON 
ODAVATELGILGMEGTrDRSRQSETYNYLLAKKXRKESOEHTKD 
LDKELDEy/IHGGKKMGSKEEENGQGHLKRKRPVKDRLGNRPFMN 











yKGRYEJTAECS0EKVADBlSFRbOEPXKDLlAJ?VVRllGNKKA 
1 ELLMETAEVEONGGLFl MNGSRRRTPGGVFXNLLKNT?S 2 SEE 
01 KDI FYI F.NQKEVENKKAARKRRTQVLGKKMK0A1KSLN FQED 
DDTSKETFASDTNEALASLDESQEGHAEAKLEAEEAIEVDHSKD 
LDIF 


6206 


10 


1442 


1 1 SERRERSCLKbVCI RCSCDWEMGSVLGLCSMASW 1PCLCGS 
APCLLCRCCPSGNKSTVTRIilYALFLLVGVCVACVMLIPGKEEO 
LNK1PGFCEKEKGWPCN1LVGYKAVYRLCFGLAMFYLLLSLLM 
I KVKSS SDPRAAVHNGFWFFKFAAAJ AI 1 IGAFF3 PEGTFTTVW 
FY VGMAGA FCF1 hi QLVLL1 DFAHSKNESWVEKMEEGNSRCtfYA 
ALLSATALNYLLSLVAI VLFFVYYTHPASCSEKKAFISVNMLLC 
VGASVMSILPKIQESOPRSGLLOSSVITVYTMYLTWSAMTNEPE 
TN CN PS hhS 11 G YNTTSTVPKEGOSVQWWHAQG 1 3 GLI LF LLCV 
FYSS1RTSNNSCVNKLTLTSDESTLIEDGGARSDGSLEDGDDVH 

rCr\ V Ulv £. K Lf\y VI IMof r/lf rib" liHoJj I J WJ 1 L>i iNWlJtjtrcKtn 

KSQWTAVWVX1 SSSW IG1 VLYVWTbVAPLVLTNRDFD 


6207 


2924 


1471



7VMAEAATPGTTATTSGAGAAAATAAAASPTPI PTVTAPSLGAG 
GGGGGSDGSGGGKTXQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFORGYClYGDRCRYEHSXPLKQEEATATEIvTTKSSLAA 
SSSLSS1VGPLVEMNTGEAESRNSNFATVGAGSEDWVNAIEFVP 
G0PYCGRTAPSCTEAPLOGSVTKE2SEKE0TAVETKKQLCPYAA 
VG E C R YGEN CVYLHGDS COM CGl>QVLiH PMDAAQRSQ H I KS C 3 E A 
HEKDMELSFAVORSKDMVCGICMEWYEKANPSERRFGILSNCN 
HTYCLKCIRKWRSAXQFESKI IKSCPECRITSNFVI PSEYWVEE 
KEEKOKL1LKYKEANSNKACRYFDEGRGSCPFGGNCFYXKAYPD 

LjKfit- r"- t\Jri\/ A V\j 1 ^> J K l KH.\Jt\KNn r riCjLil. CC.KE.rvo J*.*' r Ul*UL.LsIL, 

WTF31X3EMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6208 


2924 


1471 


TVMAEAATPG1TATTSGAGAAAATAAAASPTPIPTVTAPSU3AG 
GGGGGSDGSGGGWTKQVTCRYFMRGVCKEGDNCRYSHDLSDSPY 
SWCKYFORGYCIYGDRCRYEHSKPLKQEEATATELTTKSFUIA 
SSSLSSIVG PLVEMNTGEAESRNSNFATVGAGSEDWVNA2 EFVP 
GOP Y CGRTA PS CTEAPI.OGSVTKEE SEKEQTAVETKKOLCP YAA 
VG ECR YGEN C VYLHGDS CDMCGLQVLHPMDAAQR SOH 3KSC3EA 
HE KDM ELS FAVORS KDMVCGICMEWYEXANPSERRFG 1 LS NCK 
KTYCLKCIRKWRSAKQFESKI I KSCPECK1TSNFVI PSEYWVEE 
KEEKOKLI h K Y KEAMS NKACRYFDEGRGSCPFGGNCFY XHA Y PD 
GRREEPQROKVGTSSRYRAORRNHrl'fELIEERENSNPFDNDEEE 
WTF E LG EM LLMLLAAGGDDELTDS E DEWDLFHDE LEDF Y DLDL 


6209 


1758 


829 


ERLCFPCM0SK1YSYMSPNKCSGMRFPLQEENSVTKHEVKCOGK 
PLAG I YRKRE EKRNAGNAVRSAMKSEEQKI KDARKGPLVPFPNQ 
KS E AAEP PKT PPS SCDSTNAAIAKQALKKP IKGKOAPRKKAQGK 
TQONRKLTDFYPVRRSSRKSKAELQSEERKRIDELIESGKEEGM 
K3DLZDGKGRGVIATK0FSRGDFWEYHGDLIE1TDAKKREALY 
AQDPSTGCY^5YYFQYLSKTYCVDATRETNR1X5R1•INHSKCGNCQ 
TKLHDIDGVPHLILIASRDIAAGEELbYDYGDRSKASlEAHPWL 
KH 


6210 


3763 


387 


IFGKSKJjRMVLLEDSGSADFRRHFVNLSPFTITVVLLLSACFVT 
SSLGGTDKELRLVTX3ENKCSGRVXVK^0EEVTCTVCNNGWSKEAV 
SVI CNQIiGCPTAI KAPGKANSSAGSGRIWMDHVSCRGNESALVID 
CKHDGWGKHSNCTHCODAGTVTCSIXSSNLEMRli'TRGGNWCSGR I E 
I KFQGRVJGTV C DDN FN 1 DHAS VI CRQLE CGS AVS FSGS S NFG EG 
SGP1 WFDDL3 CNGNESALWNCKHCGWGKHNCDHAEDAGV J C£ KG 
ADLS LRLVTXS VTECSGRLE VR FOGEWGTI CDDG WDS YD AAVACK 
0LGCPTAVTA1GRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE 
WGKHYCNHNEEAGVTCSIX;SDbEljRLRGG^SRCAGTVEVEiQRL 
LGKVCDRGWGLKEADVVCRQ3jGCGSAI>KTSYQVYS K IQATNTWIj 











FbSSCNGNETSLWDCKNWC^GGLTCDHYEtAKITCSAUREPRLV 
GGDI PCSGRVKVKHGDTWGSl CDSDFSLEAASVLCKELQCGTW 
S I LGGAK FGEGNGQ I WAEE PQCEGHE SHLSLCPVAPR PEGTCSH 
SRDVGVVCSRyTEIRLVKGKTPCEGRVELKTLGAWGSLCNSHWD 
I EDAHVLCQQLKCGVALST PGGARFGKGNGOI WRHWFHCTGTEO 
HMGDCPVTALG/^LCPSEOVASVICSGNOSQTLSSCNSSSIjGPT 
RPTIPEESAVAC5ESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DSWDLSDAJrWCROLGCGEAlNATGSAHFGEGTGPlWliDEMKCN 
GKESRIWOCHSKC-WGQQNCRHKEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVFYNGA.WGTVGKSSMSETTVGWCROliGCADKGKINP 
ASLDKAKSIPMWDNVOCPKGPDTLWOCPSSPWEKRLASPSEET 
WITCDNKIRL0EGPTSCSGRVEIWHGG5WGTVCDDSWDLDDAQV 
VCWLX5CGPALKAFKEAEFGOGTGPIWLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VG I LGWIjLA I FVALFFLTKKRRORORLAVSSRGEKLVHQIQ YR 
EMNSCLNADDLDLKNS S GGK SE PH 


6211 


376: 


387 


ifgmsklrmvblensgsadfrrhfvnlspftitwlllsacfvt 
sslggtdkelrlvdgenkcsgrvevkvqeewgtvcnngwsmeav 
svi cnqlgcptai kapgiwssagsgri wkdhvscrgnesalwd 
ckhdgwgkhsncthc30dagvtcsdgsnlemrltrggnmcsgrie 
jkfcgrwgtvcddnfnidhasvjcrqlf.cgsavsfsgssnfgeg 
so p i wfddl i cngnes alv7nckhogwgkhncdhaedag vi cs kg 
adlslrlvdgvtecsgrlevrfqgewgt3cddgwdsydaavack 
0lgcptavta2 grvnas kgfghl wldsvscoghepavwqckhhe 
wg khy ckh n e dag vtcs dgs dlelrlrgggsr cagtveve 1 orl 
1.g kvcdrgwglkeadwcrqlgcgsaxkts yqvys k1gatntwl 
flsscngnetslwdcknwqwggltcdhyeeakitcsaiireprlv 
ggdi pcsgrvevkhgdtwgs 1 cds dfsleaasvlcrebocgtw 

SllGGAHFGEGNGOlWAEEFQCEGHESHljSbCPVAPRPEGTCSH 
SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCKSHHD 
1 EDAHVLCOOLKCGVALSTPGGARFG XGNGQ1 WRHMFHCTGTEQ 
HMGDCPVTAJjGASliCPSEOVASViCSGNOSOTLSSCNSSSLGPT 
RPT1 PEESAVAC1 ESGQLRLVNGGGRCAGRVE1YHEGSWGTICD 
DSWDLSDAHWCROLGCGEAI NATGSAHFGEGTGPI WLDEMKCN 
GKESRI»OCHSHGWGCONCRV5KEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVFYKGAWGTVGKSSMSETTVGVVCROLGCADKGKINP 
ASLDKAMS 1 PMWVDNVQCPKGPDTLYJQCPSSPWEKRLASPSEET 
W1TCDNK1RU?EG?TSCSGRVEIWHGGSWGTVCDDSWDLDDA0V 
VCQQLGCGPALKAFKEAEFGOGTGP1WLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVOKTPOKATTGRSSROSSFIA 
VGI 1>G V VL1A1 FV ALFFLT K KRRQRQ R LAVS SRG EN LVHQ 1 QYR 
EMNS CLNADDLDLKNSSGGH SEPH 


6212 


1 


1134 


LKWELRPGGAWGTGRGAGTGAPRSCCCQTNPGPPSSbRRAFRR 
RELPFPACHE I GLGAEAGSG P PPAPAARESRSRAMEEEASSPGL 
GCSKPHbEKLTLGI TRILESSPGVTEVTI I EKPPAERHM1 SSWE 
OKNNCVMPEDVKNFYLMTN-C-FHMTWS VKLDEHI I PLGSKAINSI 
SKLTOLTOSSMYSLPNAPTLADLEDDTKEASDDQPEKPHFDSRS 
VIFELDSCNGSGKVCLVYKSGKPALAEDTEIWFLDRALYWHFLT 
DTFTAY Y RLLI THLGLPQWQ Y AFTS YG I S PQAXQRVSMYKP I T Y 
NTNLLTEETDSFVK KLDPS KV FKS KNK I V I PKKKGP VQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6213 




1134 


l»KWEliR PGGAVW7GRGAGTC APR SCCCQTN PGP PS S LRRAFRR 
REbPF PACHE I GLGAEAGSG P P PAP AARE SRSRAMEEE AS S PGL 
GCSKPHLEKLTLG I TR I LESS PGVTEVTI I EKPPAERHMI SSWE 
OKNN CVMPEBVKN F YLMTNG F HMTWSVKLDEHI IPLGSMAINS I 
S KLTO LTQS S MY S L PNAP T1»ADLEDDTH E AS DDQPE KP H FDS RS 








VJFELDSCNGSGKVCLVYKSGKPALAEDTEJWrLbRALVWHFLT 
DTFTAyYRLLITHLGLPQWOYAETSYGISPOAKOKVSMYKPITY 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVOPAGGOK 
GP£G?SGPS7SSTSKSSSGSGNPTRK 


6214 


2 


460 


KEltAPSAl RIIAAR1J2LGPARWQSRAAAFYFVRGFRTC-WSFVGWV 
VLGrSAKRTRLFFFLSKMAASSRAOVLALYRAMLRESKRFSAYK 
YRTY A VRR 1 RDAFRENKNVKDPVE 3 QTLVNKAK R DLGV 2 RRQVH 
IGC'LYSTDKLII ENRDMPRT 


6215 


2 


1649 


FVAGG PRG SGSAAETMPEI RVTP1.GAGQDVGRS C 1 LVS 1 AGKNV 
^',LDCG^5HMGF^^PDRRFPDFSYlTONGRl»TDFLDCVIISHFHLDH 
CGALP YFSEWGYDGP1 YMTHPT0A1 CPI LLEPYRK1AVDXKGE 
ANFFTSOM J KDCMKKWAVHLHCTVQVDDELE1 KAYYAGHVLGA 
AMFQI KVGSE£>ArYTGDYNMTPDRHLGAAWIDKCRPNLLITEST 
YATTJ RDSKRCRERDFL.KXVHETVERGGKVL1 PVFA1C-RAQELC 
ILLETFWERMNLKVPIYFSTGLTEKANHYYKLFIPWTNOKIRKT 
FVOR TiMFE FK1I I KAFDRA FADNPG PM WFATPGMLHAGQS LOI F 
RKWAGNEKNMVlMPGYCV06TVGHKIl*SGQRKliEMEGRQVLEVK 
MOVE YMSFSAHADAKG1 MQbVGQAE PES V LLVHG EAK KME FLKQ 
Kl EOKLR VNCYM?AWGETVTLPTSPS I PVGISI-CLLKREMAOGL 
LPE A K X P R LLHG TLI MKDS N FR LVS S EQ A EK E LG 1 aAEHQLR FTC 
RVKLHDTRKEOETALRVY SH LKSVLKDHCVOK h PDG S VTVES VI* 
LOA/iAPSEDPGTXVIjLVSWTYODEELGSFLTSLLKKGIjPOAPS 


6216 


11 


3S3 


"OTTKPEPRNSALRGSRSKMAWGVSSVSRLLGRSRPOLGRPMSS 
GAHGEEGSARMWKTLTFFVALPGVAVSMliNVYLKSHHGEHERPE 
FIAY PHLRI RTKPr PWGDGNHTGFHNPHVNPLP'iGY EOE 


6217 


9 


2176 


TRVGRGKSGLKMEVKPPPGRPQPDSGRRRRRPCEEGHDPXEPEQ 
LJ?KLFIGGLSFETTDPSLREHFEKWGTLTDCV\^RI)POTKR£RG 
FGFVT YS CVE EVDAAMCAK PHKVDGR WEPKRAVS R EDS VKPGA 
HLTVKKI FVGG1KEDTEEYNLRDYFEKYGKIET3 EVMEDROSGK 
KRG FA F VTFEDH DTV DK 1 WQKY HT3 NGHNCEV K KA1»S KQEMQS 
AGSORGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 

NQGGGYGGGGX5YDGYNEGGNFGGGNYGGGG>T/>:DFGNYSGQQ0S 
NYGFMXGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


6218 


230S 


90e 


£CERRGFIf/ 1 AI)DLKJ2FEYKXl,PSV^GLHAIVV£nRlx5VPV2KVA 
NDNAPEHALRPGFLSTFALATDQGSKLGLSKNKS 1 1 CYYNTYQV 
VQFNR 1»P LWS F I A S S S ANTG L I VS LE KELAP LF E ELR QVVEVS 


6219 


2 


890 


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPFAPRPHQRPFL 
IGVSGGTASGKSTVCEKIMELLG0NEVEQRQRKWIE5QDRFYK 
VLTAEOKAKAtKGOYNFTJHPDAFDNDLMHRTLKKlVEGKTVEVP 
TYDFVTHSRLPETTWYPADWLFEGI LVFYS0E3 RDKFHLRLF 
VDTES DVRLSRR VLRDVRKGRDLEQ3 LTQYTTFVK PAFEEPCLP 
TKKYADVI I PRGVPNMVAINL1 VQHIQDILNGDI CKWHRGGSNG 
RS YKRTFS EPGDHPGMLTSGKRSHLESSSRPH 


6220 


227 


764 


E0N3 S LEMSCT1 EWOjADAKALVERLRDHDDAAESLI EQTTALN 
KRVEAJiKQYUKBJOELNEVARHRPRSTLVMGIO^E^OIRELOO 
ENKELRTSLEEH0SAJLEL1MSKYREQMFRLLMASKKJDDPGIIMK 
LKEQHSKIDMVHHNKSEGFFLDASRHILEAPOKGLERRHLEANQ 
WVH 


6221 


98 


916 


RWr^LNPVSDGLELRPKWGILHCLTTlWKLDGLRGLYOGVTP 
NIWGAGLSWGLYFVFYNAIKSYKTEGRAERLEATEYLVSAAEAG 
AMTLC1TNPLWVTKTRLMLQYDAWNSPHR0YKGMFDTLVK1YK 
YEGVRGLYKGFVPGLFGTSHGALOFMAYELLKLKYKOHINRLPE 
AQLSTVEY I SVAALS KI FAVAATY P YQVVRARLODOHMFYSG VI 
DVI TKTVIRXEGVGGFYKGIAPNLI RVTPACCITFVVYENVSHPL 
LDLREKRK 


6222 


2 


2a 16 


MAKELRALLLWGRR LRPLLRAPALAAVPGG KP I hC PRRTTAQLG 
PRRNPAWSLOAGRLFSTOTAEDKEE PLUS I ISSTESVQGSTSKJ1 
EF0AETKK1.LMVARSWSEKEVK1REL3SNASDALEKLRHKLV 
SDGQALPEMEIKLQTNAEKGTITIODTGIGMTCEELVSNLGTIA 
RSGS KAFLDALQNQAEASS K I IGQFGVGFYSAFMVADR VE VYSR 
SAAPGSLGYQWLSDGSGVFEIAEASGVRTGTKl 3 3HLXSDCKEF 
S S EARVRD WTK YS NFVSFP LYDNGRRMNTLQA 3 WMMDP KDVRE 
WOHEEFYRYv'AOAHDKPRYTLHYKTDAPLMIRSJFYVPDMKPSM 
FDVSRELGSSVALYSRKVLI Q7KATD3 LPKWLRF1 RGWDSKDI 
PLNLSRELLOESAI^IRKI^DVLQGRLIKFFIDOSKKOAEKYAKF 
FEDYGLFMR^GIVTATEOEVKEDI AKLLRYESSAiPSGQLTSLS 
E Y ASR MRAG TRN3 Y YLCAPNR H 1»AEHS P YY E A>i K KKDTEVLFCF 
EQFDELTLL.HLREFDKKKL3 SVETD3 WDHYKEEKFEDRSPAAE 
CLSEKETEEL^WMRNVXGSRVTNVKVTLRLDnJPAMVrVLEMG 
AARH FLRM00LAKTQEERAQ1 jLQ P TLE 1 N PRHAL 3 KKL.N QLRAS 
E PGLAQLLVDQI Y ENAM I AAGLV DDPRAMVGR LNELLVTCAL ERH 


6223 


3 


715 


DAKAR TMAGMVDFQDEEQVKSFLENMEVECNYHCYhEKDPDGCY 
RL VDYLEG 1 R XNFDEAAKV LK FN CK FNQHSDSCY KLGAY YVTGX 
GGLTQDL KAA ARC FLMACE K PG K K S I AAC H NVG L LAHDG QVNB D 
GO PDLGKAR DY YTRACDGG YTS S C FNLS AM FLOGAPG FP KDKDL 
ACKYS MXACDLGH I W ACANASRMY K LGDG VDKV E AKAEVLKNRA 
QOVHKEO0KCVQPLTFG 


6224 


1 


133 


LRTISSMAWGPLLLTLLAHCTGSWAOSVLTQPPSVSGARIPHEK 


6225 


3259 


938 


LLSCHRIAICKIjPFSVESRKTVMGPOGARROAFL^FGDVTVDFT 

okenriilspaoralyrevtlenyshlvslgilhskpelirrlfq 
gevpwgeerrrrpgpcagiyaehvl^pknlg1ju^orc?qolofsd 
qs fosdtaegoekekstkpmafs s p plrhavssr rrns we3 es 
sqgqrenpteidkvlkg1ensrwgafkcaerg0dfsrkmmvi1h 
kkahsrqklftcrechogfrdesall.lho^thtgeksyvcsvcg 
rgfsbkakllrhorthsgekpflckvcgrgytsxsyltvherth 
tgekpyecoecgrrfndkssynkhlkahsgekpfvckecgrgyt 
nksyfwhkr 3 hsgekpyrcoecgrgfsnkshli thorthsgek 
pfacrocxosfsvkgsllrhorthsgekpfvckjxtersfs0kst 
lvyh0rthsgekpfvcrecg0gfiokstlvkh0ithseekpfvc 
kdcgrgf10kstfti>hqrthseekpygcrecgrrfrdkssynkh 
lrahu5ekrf fcrdcgrg ftlk pnlt i hqrthsge kpfmckoce 
ksfslkanllrhqwthsgerpfnckdcgrgfilksrllfhokth 

SGEKPFICSECGGGFIWKSNIjVKHOLAHSGKOPFVCKECGRGFN 

wkgnli,thorthsgekpfvc>tvcgogfswkrsltrhhwrihske 
kpfvcqeckrgytsksdltvherihtgerpyecoecgrkfsnks 
yys khlkrh lrekrfctgs vgeass 


6226 


29 


266 


TKVSEl/LGGSORLFFLPLMRRLCRCGLGPRVSPMAGPRVEVDGS 
I MEGGGQS LR VSTGLS WLLS L PWRAQR 1 RAGRS Y A 


6227 


2581 


890 


MS AS S L.L EQR P KG OGN KVONGS VHO K DG LNDDD FE ? YbS POAR P 
NNAYTAMSDSYLPSYYSPSIGFSYSLGEAAWSTGGDTAMPIfLTS 
YGOLSNGEPHFLPDAMFGOPGALGSTPFLGOHGFN1--KPSGIDFS 
AWGNNSSQGOSTQSSGYSSNYAYAPSSLGGAM1DG0SAFANETL 
NKAPGMNT3 DCGMAALXLGSTEVASNVPKVVGSAVGSGS I TSKI 
VASNSLPPAT3APPKPASWADIASKPAKQ0PKLKTXNGIAGSSL 
PPPPJ KHNMD 3 GTWDMKGPVAKAPSOALVONIGOPTOGSPQPVG 
QQANNS PPVAOASVGQQTQPLPPPP PQPAQLSVQQOAAQPTRWV 
APRKRGSGFGKNGVDGNGVGQSOAGSGSTPSEPHPVLEKLRSIW 
NYNPKDFDWNLKHGRVFI I KS YSEDDI HRS1 KYNI WCSTEHGNK 
R LDAAYRSMNGKG PVYLLFS WGSGH FCGVAEMKS AVDYNTCAG 
VWSODKWKGRFDVRWIFVKDVPNSOLRHIRLENNENKPVTNSRD 
TOEVPLEKAKQVLKI IASYKHTTS 1FDDFSHYEKR0 
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SEC 


6228


47 


1578 


GRRCR RFC A VMELAQEAR EL G CWAVEEMGV P VAAfA PESTbR R L 
CLGOGAD3 VIAYILQHVHSORTVKKI RGNbLWYGKC.DSpQVRRXL 
El>EAAVTKLRAE3OEL»D0SLELMERDTEAQDTAf*lF0ARQHTODT 
OR RALLLRAQAGAM RRQQKTbRD PMQRLQNQI jRR LQDMERKAKV 
DVT FG SbTS AALGLEP WLRDV RTACTLRAQFLON bLbPQAKRG 
S LPTPKDDH FGTS YQOWLS SVETLLTNH P PGHVL/vALEHLAAER 
EAEIR£LCSGDGIX5DTEI£RPCAPDQSDSSQTLPSKVHLIQEGW 
RTVGVLVSQRSTLIjKEROVbTORLOGLVEEVEKRVLGSSEROVL 
1 LGLRR CC L WTELKAbHDOS QEbQDAAGHR QLbL R E bQAKQQR I 
LHWROLVEETQEOVRLLIKGNSASKTRLCRSPGtVLALVQRKVV 

TVLPSlHOLHPASPRGSSFlALrHKLGLPPGKASELLLPAAASL 
RODLLLLCDORSLWCWDLLHMKTSLPPGLPTOELLC) I QASQEKQ 
OKENLGOAbKRLEKbLKOALERIPELOGIVGDWVJEOPGQAALSE 
ELCQGLSLPQWRLRWVQAQGALOKLCS 


6229


2573 


560 


GPS LLGTRGTPNPARTLQIFFL 1 J GRRLTGKMAAVDDbQFEEFG 
NAATSLTAN PDA'XTVH I EDPGETPKHQPGS PRGSG RE EDDELLG 
NDDSDKTELLAGQKKSSPFWTPEYYQTFFDVDTYCVFDRIKGSL 
LP1PGKNKVRLYJRSNPDLYGPFWICATLVFAIA2SGNLSNFLJ 

HiA>r.J\ 1 jr. IV i'k,ir Ki\ Vi>4. AA1 1 i Y Ax AW L>V f liAbno t IjtINKjNoK 

VMM 1 VSYS FLE 3 VCVYGYSbF 1 7 1 PTA3 bVi 1 1 PH KAVRWI bVM2 
ALGISGSI.l^TFWPAVREDNRRVAliATIVTIVbl.hMLLSVGCL 
AY FFDAPEMDHLPTTTATPNQTVAAAKSS 


6230 


1723 


6 0C 


SKMSGKSGKKKMSKbSRSARAGVI FPVGRLMRYbKKGTFKYRlS 
VGA P V Y MAA V 1 EYLAAE I LELAC K T AARI3NKKAR I A F F H I LLAV7 i 

MTiTTVT MDT T trnfTT ft Cff U! nDIUDITr T TlVl/D^TV^UCFTTI.QP 
l»L/C-£.ljIVlJl'i-'«vtaV 1 1 Aduo v Iil'KJ Mt'n.Li-LirtMVKo I IV *o hot. X luor 

PPEKRGRKATSGKXGGKKSKAAKPRTSKKSKPKD'or: KEGTSNST 
SEDGPGDGFTILSSKSbVLGOKLSLTOSDISHlGSKRVEGIVHP 
TTA EIDbKEDI GKALEKAGGKEFLET VKEbR KS OG V LE VAEAA V 
SOSSGLAAKFVIHCHlPOWGSDKCEEOLEETIKNCbSAAEDKKL 
KSVAFPPFFSGRNCFPXQTAAOVTLKAlSAiiFDDSSASSLKNVY 
FLbFDSESl GJ YVOEMAKLDAK 


6231 


149 


870 


bl FSS S TMD RSLRNVLWS FGFbLLFTAYGGLQS bC'SSL Y S EBG 
LGVTALSTLVGGMLLSSMFbPFLb ■ ERLGCKGT3 I bSMCGYVAF 
SVGNFFACWYTLI PTSILbGLGAAPbWSAQCTYbTl TGNTHAEK 
AG KRG KDMVNQYFG IFFblFQSi: GVNGNLI S S bV FGQTPSQETL 
PEE QbTSCG AS DCLMATTTTNS TOR PSQOLV YT/LbG I YTGSGVb 
AVLMlAAFbOPlRDVQRESE 


6232 


3679 


1476


FVAGTTMAG FW VGTAPLVAAG RRGRW P PQOLMLS AALRTLKHVL 
YYSROCbMVSRNbGSVGYDPNEKTFDXILVANRGEjACRVIRTC 
KXtfGI KTVA 3 HSDVDASSWv/KMADEAVCVGPAPTS KSYLNMDA 
IMEAlKKTRAQAWPGYGFliSENKEFARCLAAEDVVFIGPDTRA 
I QAMGDKI ESKLLAKKAEVMT2 PGFDGVVKDAEEAVR 1AREXGY 
PVMI KASAGGGGKGMR I AWDDEE TRDG FRbSSQEAA £ SFGDDRb 
LI EKF X DN PRKI EI QVLGDKHGNALWLNERECS I OR SNQKWEE 
APS 3 F LDAET R)LAMGEQAV ALARA VKY S SAGTVE F b V DS K KN F Y 
FbEMNTR bO VEHP VTECI TGLDb VQEM I RVAKGY PLR HKQADIR 
I NGWAVECRV YAEDPYKSFGbPS 1 GRLSQYQEPLHbPGVRVDSG 
IQIX5SD3SlYYDPMISKLITYGSDRTEALKRMADAbDNYVIRGV 
THN1ALLREVJ INSRFVKGDISTKFLSDVYPDGFKGHMbTKJSEK 
NQLLA I AS Sb FVAFQLRAQHFOENSRMPVIKPD I AN KELS VKLH 
DKVHTWA5NNGSVFS VEVDGS KI .NVTSTWNbAS PbbSVS VDGT 
ORTVOCXSRFJiGGNMSIOFLGTVYKVNILTRl^EZ.NKF'MbEKV 
TEDTSSVbRSPMPG VWAVSVKPGDAVAEOPEI CV : EAMKMQNS 
MTAGKTGTVKSVHCQAGDTVGEGDLbVELE 


6233 


j 


2654 


HSTRENbNAGNPNFPSEGHLVRSTGPGGSFAKHMVAOCVSPKGP | 
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SEC 








LACSRTYFrGATrlVPyLGGDSKLPKKTEOlRLLSOIYAAVIEAV 
l*AG I AC Y AKTS £ LTKA KE V A EQTLGSG 1 J) S FEL I P FKAAL R S KM 
TrHIHAVNNOGKJVPLDSCDSLSFVKTACMAVYDIPDLLGGNGC 
IJGS WFSES FLTS Ql LVXEXDGTVTTETS S WLTAAV ?P F CS WL 
VEDNEVKLSEKTHOAVRGL^SFLGTYLTGGEGAYLYSSNL-OSWP 
EEGNVHFFSSGLLFSHCRKGSIIISKDHMNSISFY-DGDSTSTVA 
ALLIDFKSSLLFHLPVHFHGSSNFLMlAi.KPKSKIYQAFVSEVF 

EKRSSLKLLSAKLPELDWF LOHFAISS I SQEPVKRTHLPVLLQQ 
AEIN7THRI ESDKVI I SI VTGLPGCHASEJjCAFLVTLHKECGRW 
MVYRQ3 MDSSECFHAAHFC RYLSSALEAOOMRSARQSAYIRKKT 
R LLWLQGY TD V I D WQALQTH PD5NV KASFTJGA3 TACVE PMS 
CYMEHRFLFPKCLDQCSOC-LVSNVVFTSIJTTEORKPLLVOLOSL 
3RAANPAAAF1 LAENGIVTRNEDIEU LSENSFSSPEMLRSRYL 
KYPGWYEGKI»NAGSVYPl>'VOICVWFGRPLEKTRFVAKCKAIQS 
SIKPSPFSGNlYKIl^KVKFSDSERTKEVCYNTLANSLSiMPVL 
EGPTPPPDSK^VSODSSGCCECYLVFIGCSLKEDEIKDWLROSA 
KOKPQRKALKI'RGMLTQOE J RSIHVKWH),EPLPAGYFYNGTQFV 
NFFGDKTDFHPI.MD0FMMCYVEEANRE3 EKYNQELEQQEYHDLF 
ELKP 


6234


1733 




PRVREDMDHKSPGNKGSLVYAGIKS1VKSSL.GMVESSRHNWSGL 
DKOSDIONLNEKRILALOICGWIKKGTDVDVGPFLNSLVCEGEW 
ERAAAVALFNLD1 RRAIQI ^NEGASSEKGDLNLNWAMALSGYT 
DEKKSLWREHCSTLRLOl>NNPYLCVMFAFLrSETGSYDGVLYEN 
KVAVRDRVAFACKFliSDTOLNRYIEKLTKEMKEAGNLEGlbLTG 
LTKDGVDLMELYV/URTGDVCTASYCMl .QGSPLDVLKDERVQYWI 
EN YRNLLDAWK FiWKRAEFE J HRSKLDPSS K PLAQVFVS CNFCG 
KSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRKPLPRC 
7a,CLINMGTPVSSCPGGTKSDEKVDL£KLiKKLA0FNNWFTWCKN 

OP 


6235 


1 


571 


E KR DHR L.PS W 1- ^AALK VPGR GGR VGTTP t LAAGG I MATR NPPPO 
DYESDDDSYEVLDLTEYARKHQWWNRVFGHSSGPMVEKYSVATQ 
1 WlCZCZVTGWCkan ,T?C) VVrVl £A^AVGGnFl.I»LiOI.nQttSGY"\/PI 

J V I IVjjVt V 1 Un^Ml.** r JUT Vijfv l_*rvri .* V v- V-' ^ i Lj XJl—J^' X -* v V 

DWKRVEKDWKAXRQI KKxANKAAPEI NNL I EEATEFIKONI VI 
SSGFVGGFLl.CLAS 


6236 


1 


703 


WDONKGAAAGSGLTLPSLPSA^FSAGPPTORSRPTMSNMEKHLF 
NLKFAAKELSF. S A KKCDKE E KAEKAKI KKA3 QXGNKEVAR I HAE 
N A I RQ KNQAVN F LRMS AR V DA VAAR.VQTA YTMGKVT KS MAG WK 
SMDATLKTMN) ,E K ISALMOK FEHQFETI ,DV0TO0MEDTMS STTT 
LTTPQNQVDKLLOEMADEAGLDLNMELPQGOTGSVGTSVASAEC 
DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEG 1 AAGGW.DVN 1 ALQEVLKT ALi HDGLARG I R FAAKA 
LDKROAHLCVLASNCDEPM YV KLVEALCAEHQ I NL1 KVDDNKKL 
GEVfVGLCKIDREGKPRKWGCSCVWKDYGKESQAKDVIEEYFK 
CKK 


6238


2 


4666 


EBVPT0ESVKWE1NV1IKNPEIVFVADMTKNPAPALV1TT0CEI 
CYKGNLENSTMTAAI KDLQVRACPFLPVKKKGKI TTVLOPCDLF 
YQTTQKGTD PQ V I DtfSVKSLTLKVSPVI 1KTMITITSALYTTKE 
TIPEETA5STAHLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEMI KMNIDSI FIVLEAG • GHRTVPMLLA.KSRFSGEGKjn^SSL 
I NL»H COUELE VH Y YNEMFG VWEPLLE P LE J DQTEDFRPWNJjGI K 
MKKKAKMAI VFSDPEEENYKVPEYKTVISFKSKDOLNITLSKCG j 
LVMLNNLVKAFTZ AATGSS ADFVKDLAPFK I I>NSLGIiTI S VS PS 
DSFSVLMI PMAKSYVLKNG£SL£MDYI RTKDNDHFNAMTSLSSK 
LFF3 LLTPVKK S TADX1 PLTKVGRRLYTVRHRES OVERS I VCQI 
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SEQ. 








DTVEGSKKVTIRSPVQIRNHFSVPLSVYEGDTLIiGTASPENEPN 
JPLGSYRSFI FLKPEDtNYGMCEGI DF5E1 1 KNDGAbbKKKCRS 
KNPSKE5FLINI VPEKDNLTSLSVYSEDGWDLPYIMHLWPPILL 
RNLLPYK 1 A Y Y I EG I ENS VT TLSEGHS AQ I CTAQLGKARLHLX L 
LDYLNHDWKS EYH 1 KPNQQD1 S FVS FTCVTEMEKTDbDl AVHMT 
YNTGQTVVAFHSPYWMVNKTGRML0YKADG1HRKHPPNYKKPV1, 
FSFQPNHFFJWHKVObMVTDEELSNQFSlDTVGSHGAVKCKGLK 
MDYQVGVT3DLSSFN1TRIVTFTPFYM1KNKSKYHISVAEEGKD 
KWLSLDLE0CIPFWPEYASSKLLIQVERSEDPPKR1YKNKQKNC 
1 LLR LDN E bGG 1 1 A E VNLAEH S T V I TFbD Y H DG AATFLL 1NHTK 
NELVQYNOSSLSEIEDSLPPGKAVFYTWADPVGSRRLKWRCRKS 
HGEVTOKDDMHMPIDLGEKTIYLVSFFEGLORHLFTEDPRVFK 
VTYESEKAE LAEQEI AVAbQDVG I S LVNNYTKQE VAYI G 1 TSSD 
WWETKPKKKARWKPMSVKHTEKLEREFKEYTESSPSEDKVIQL 
DTNVPVRbT PTGHNMKI U?PHV3 AbRRNYLPAbKVEYNTSAHOS 
SFRIQIYR 3 01 QNQ1 HGAVFPFVFY PVKPPKS VTMDSAPXPFTD 
VS 1 VMRSAGHSQI SR3KYFKVL3 QEMDLRLDLGFI YALT DbMU'E 
AEVTEWTEVELFHKDIEAFKEEyKTASLVDQSOVSLYEYFHISP 
IKLHLSVSI.SSGREEAKDSKONGGLIPVUSLNLLLKSIGATLTD 
VQDWFKLAFFELNYQFHTTSDLQSEVIRHYSKOAIKOMYVLIL 
GLDVLGNPPGLI R E FSEGVEA FFYE PYQGA I QG PEEFVEGMAXG 
LKALVGGAVGCL-^CAASK I TGAMAKGVAAMTMDEDYOOKRREAM 
NKOPAGFREGITRGGKGLVSGFVSG 1TG3 VTKP I KGAQKGGAAG 
FFKGVGKGLVGAVARPTGG 1 1 DMASSTFQG1 KRATETSEVESLR 
PPRFFNEDGVIRPyRLRDGTGNOMbOKlQFYREWlMTHSSSSDD 
DDDDDDDDESDL^H 


6239 


2108 




K PGMAGKGS SGRR PbLLGbbVAVATVWLVICP Y TKVEES FNLQA 
TKDLbYHWODbEQYXHLEFPGVVPRTFbGPVVIAVFSSPA^/YVl 
SLLEMSKFYSQbl VRGVLGLGV J FGL WTLQ K£ VRR H FG AM VATK 
FCVm'AMOFHbMFYCTRTbPrm^PWLLAliAAWLRKEWARFI j 
WLSAFA3 1 VFR V2LCLFLGLLLLbALGNRKVSWRAbRHAVPAG 
I LCLGLTVAVDS Y FWRQLTW PEG KVLWYNTVbNKS 51WGTS PLL 
WYFYSAbPRGLGCSLLFl PbGLVDRRTflAPTVLALGFKALYSbb 
PHKELRF1I YAF?MLNITAARGCSYLLNNYKKSWLYKAGSLLV3 ! 
GHLWNAAYS ATAbYVSH FNYPGGV AMQRLHQbVP PCTDVbbHI 
DV AAAQTGV S R FLQVUSAWRY DKREDVQPGTGMLAYTHI bMEAA 
PGbbALYRDTKRVLASWGTTGVSLNbTQLPPFNVHbCTKLVLb 
ERLPRPS 


6240


2202 


1176 


HERGDSLKEPTS1AESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DFLSSGSRSSSbKSAQGTGFSLGOLOSXRSEGTTSTSYKSbANO 
TRNGSLSYDSbLTPSDSPDF^SVOAGPEPDPPLGYTSPFLSARb 
AOQREAERH PR LVPTGPTHR 2PS FVR YDNLSRH1 VASLQEREKL 
LRQS PPLPGREEEPGLGDSGI OSTPGSGHAPRTSSSSDDS KRSP 
LGKTPIiGRPAVPR FGKPDGLRGRG VGSPfcrwr lAr i lA?Kb no I o 
SOKAQPGVSETEEVAXOPLbTPKDEVObKTTYSKSNGQPKSIiGS 
ASPGPGQPPbSSPTRGGVKKVSGVGGTTYEISV 


6241 


3 


1341 


RNAEEKXRbSbQREKJ 1ARVS I DNRTRAbVQALRRTTDP K bC I T 
RVEEbTFHbbEFPEGKGVAVKERIl PYLLRLRQ1 KDETLOAAVR 
EI LAbI GYVfcPV KGRGIR IbSl DGGGTRGWAXQTLRKLVELTQ 
KPVHQLFDY I CGVS TGM LAFMLGbFHMPLDECEEbY RKLGS DV 
FSQNVIVGTVKMSWSHAFYDSQTWENIbKDRMGSALMIETARNP 
TCPKVAAVST I VTJRG 3 TPKAFVFRNYGHFPGINSH YbGGCOY KM 
WQA J RA SS AA PG Y FAJS YAIX?NDbHQIXK?bbIjNrf PS AIiAMHECKC 
bWPDVPLECIVSbGTGRYESDVRNTVTYTSbKTKbSNVINSATD 
TEEVHrMbDGbbPPDTYFRFKPVMCENIPbDESRNEKbDOLQbE 
GbKYIERNECJKWKIO/AKILSQEkTTLOKINDWIKbKTDMYEGbP 
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SEQ 








FFSKL 


6242 


lye 


1310 


QHFLPGAETWSP3AAVCTARSFPGRSLAAFPRPAAPRRAVEMGE 
SSEDIDCMFSTbLGEMDLLTOSLGVDTLPPPDPNPPRAEFWYSV 
GFKDLNE SLNAIjE DODLDALMADLV ADI SEAEORTI QAQKESLQ 
NQHHSASl.yASI FSGAASLGYGTNVAATGISQYEDDLPPPPAEP 
VLDLP LP PP PPE PLSQEEEEAQAKADKI KLALEKLKEAXVKKLV 
VKVHMNDNSTKSLMVDERQLARDVLDNLFBKTHCDCNVDWCLYE 
IYPELOIPRFFEDHENVVEVXjSPWTRDTEKKILFLEKEEKYAVF 
KNPONFYLDNRGKKESKETI^KhlNAiCNKESLLEVRLlLOSGRKE 
KDVCS I F KS FAS EWGK1 


6243 


2509 


614 


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TSR AS SRRLACG PQ7RAGAETRSTAMI RANS AARDTRRATCRSA 
AGTPSPTTMTCLTDVPTGCAAVEPTARLPAAAWAST1TTGCCPA 
MGQAGAGFAGRKGSEAGGGPGPJUiRAHPSPLPREPRVRTGPPAH 
SPTPGSI DPS PELSWGSAGVTQES PLLDFVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFS3QPQPPP01AFHPRDFPAGTKRQLILVPLK 
GPPI LAP I LSLTP 1 1 SRWSCYFPRSR I AQGWHL5 


6244 


2119 


1745 


FEHAYASOFGTFLGNNESERCKLKLQQKTMSLWSWWQPSELSK 
FTN P L F L* ANTJ L V I WPS VAPQSLPLWEGI F1RWNRSSKYLDFJVYE 
EMVN I I E YNKELQAKVN J LRRQLAFXETECGMQES P 


6245 


81 


1148


LSLRNAKY£FP0ELISLFSMTD1>NDNICKRY1KMITN1V3LSLJ 
I CI 5 LAFWI I SMTAST Y YGNLRP 1 3 PWRWL FSVWP VL1 VSNGL 
KKKSLDHSGALGGLWG FI LT1ANFS FFTSLLMFFLSS S KLTKW 
KGEV KKRLDSE Y KEGGORNWVQVPCNGAVPTELALLYK1 ENGPG 
EIPVDFS KOYSASWMCLSLliAALACSAGDTWASEVGPVLSKSSP . 
RL 1 TTWE K V P VGTNGGVTWGLVSS LLGGTFVG I AY FLTQLI FV 
NDLDT SA PQWP 1 1 AFGG LAGLLGS I VDS YLGATMQYTGLDESTG 
MWNS PTNKARH I AGK P I LDNNAVNLFS S VL I ALLLPTAAWG FW 
PRG 


6246 


2177 


359 


S liWP Vi I LMDDS LMQ I SLOLLCVYTAN FPNGCS S LCWSSCGQHPV 
OATHRGAVSNSI^LCILKLASQMFLENTTVOQMVFMLLSNLALS 
HDCKGV10KSNFL0NFLSLAI>PKGGNKHLSNLTlbWLKLLLNIS 
SGEDGOOMlLRLL»Ga,DLLTEMSKYFJ4KSSPLLPLLlFHNVCFS 
PANKPKII ANEXV I TVLAACLESENQNAQRIGAAALWALI YNYO 
KAKTALK 5 PS VKR R VDEAYSLAKKTFPNSEANPLNAY Yl» KCLEN 
LVQLLNSS 


6247 


3 


1676 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGLVPLTDDTSHAGP 
PGPGRALbECDHLRSGVPGGRRRKDWSCSLbVASLAGAFGSSFL 
YGYNLS\T/NAPTPY I KAFYWESWERRHGRPIDPDTLTL1.WS VTV 
SI FA1 GG!,VGTLI VK.M1GICVLGRKHTL1JWNGPAISAALLMACS 
LOAGAFEMLIVGRFIMGIDGGVALSVLPMYLSEISPKEIRGSLG 
OVTAI FI CJGVFTGObLGLPELLGKESTWpyXFGVI WPAWOL 
LSLPFLPDSPRYLLLSKHNEARAVKAFOTFLGKAHVS0EVEEVL 
AFCJRVORc TRT.V^VT.fr.T UaPYVRWnWTVI VTMACYOIjCGT.NA 
I WFYTNS I FGKAG I PPAKI P YVTLSTGG I ETLAAVFSGtiVI EHL 
GR R PLL I GGFGLMG LFFCTLTT TLTLQDHAPWVPYLS 1 VG I LA I 
IASFCSGPGGI PFI LTGEFFOQSQRPAAFI I AGTVIWL5NPAVG 
LLFPPIOKSLDTYCFLVFATICITGAIYLYFVLPETKNRTYAEI 
S OAFS XRN XAYPPE EKI DSAVTDGKI NGR P 


6248 


56- 


1773 


VPPPR MMAAVP PGLE ?WNR VR I PKAGNRS AVTVG/N PG AALDLCI 
AAVI KECHLVI LSLKSOTLDAETDVLCAVLYSKKNRMGRHKPHL 
ALKQVEQC L.KRXWv'MNLEGS I ODLFELFS SNENOPLTTKVCVVP 
S 0 P VV E LV LMKVLG ACKLLLR LL0CC CKTFLLTV KHl iGLQE F 1 1 
LNLVM VGLVSRXWVLYXGVLKRIil LL YEPLFGLLQEVARI QPMP 
Y FKDFTFPSD1 TE FLGQPYFEAFKKKMPI AFAAKG I NKLLNKLF 
L I N EQS PRAS EETLLG I SKKAKQMKINVQNNVDLGQP VKNKRVF 
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SEQ 








KEESSEFDVRAFCNOLKHKATpETSFDFKCSOSRLKTTKYSSOK 
VI GTPHAXSFVQRFREAESFTQLS EE1 OKAWWCRS KKLKAQAI 
FbGnXLLKSNRLKHLEAOGTSLPKKLEClKTSlCNKLLRGSGIK 
TS KKH LRQRRSONKFLRRQRKPQR KIXJSTLLRE 1 00 FSQGTRKS 
ATDTSAKWRLSHCTVHRTDLY PNS KCLLNSG VSMPV 1 QTKEKMI 
KENLRG 1 HENBTDSWTVMC INKNSTSGT J KETDDl DD I FALMGV 




Cf. 


1773 


VP PPR MMAA VP PGLE PWNR VR I PKAGNRSAVTVONPGAALDLC I 
AAVI KECHbVIISLKSQTl DAETDVLCAVLYSNHNRMGRHXPHL 
ALKOVEOCLKR LKNMNLEGS 3 QDLFELFSSN ENOPLTTKVCVVP 
SO PWELVLMKVIjGACKLLIjRLLDCCCKTFLLTVKHLGIjOEFI I 
LNLVMVGLVSRLWVLYKGVLKRLILLYEP1.FGLL0EVARIQPMP 
YFKDFTFPSDITEFLG0PYFEAFKKKMPIAFAAKG1WKL.LNKLF 
LINEOSPRASEETLLGISKKAKOMKINVONNVDLGQPVKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSOSRLKTTKYSSOK 
VI GT PKAKS FVQRFREAES FTOLS E E I QMAWWCRS K KbXAQAl 

r IA^NAJjIj AblVK Dr^btAVA^ 1 oL»r'r\Mjfc.\_J ft J o 1 V„r*rlL»L»KV»t>v>l JV 

TSKHHLRQRRRONKFLRRORKPQRKIjOSTLLREIQOFSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNSKQLI.NSGVSKPVIOTKEKMI 
HENLRGIHENETDSWTVMOIKKNSTSGTlKETDDIDDIFAbMGV 


6250


231


1306 


LiAALH I MALPFRKDLEKY KDLDED EI>1X3NLSETELK0LETVLDD 
LDPENAJjLPAGFRQKNOTSKSTTGPFDREHLLSYbEKEALEHKD 
REDYVPYTGEKKGKI F3 PKQKPVQTFTEEKVSLDPELEEAL.TSA 

EKI LPVFDEPPNPTNVEFPLKRTXENDAHLVEVNLNNI KNIPI P 
TLK D FAKALETNTHVKC FSLAATR S NDPVATAFAEMLKVNXTLK 
SLNVESNF3 TGVG1LAL1 DALRDNETLAELXl DNQRQQLGTAVE 
LEMAKMLEENTYJILKFGYOFTOX^GPRTRAANAITKNNDLiVRKRR 
VEGDHC. 


6251 


62 


972 


TPGSGPMSAWAAASliSRA/'ARCLLARGPGVRAAPPRDPRPSHPE 
PRGCGAAPGRTLHFTAAVPAGHNKWSKVRM KGPKDVERSRI FS 
KLCLNIRIjAVKEGGPNPERNSNLANILEVCRSKHMPKSTIETAL 
X^EKSKOTYLLYEGRGPGGSSLLIEAI^NSSHKCOADIRHILNK 
NGG VMAVGAR HS FDKKG V I WEVEDREKKAVNLERALEKA I EAG 
AEDVKETEDEEERNVFKFICDASSLKQVRKKLDSLGLCSVSCAL 
EF2 PNS K VOLAE PDLEQAAH LI QAliS NHEDV I HVYDN 1 E 


6252 


27 


1897 


EEFCTWIAVRVGEWETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVP TTAGAS PGPPRNK KNRELRPORPXNAY I LX KSR IS KXPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWOKFCRIDKSR 
K3j PH S KA KTR SR LEVAEAE EEETS I KAAR SE1>1»1AEE PG FLEGE 
DGEDTAKICQADIVEAVDIASAAKHFDLNLROFGPYRLMYSRTG 
RHLAFGGRRGHVAALDW^TKXLMCEJMVMEAVRDIRFLHSEALL 
AVAONR WLHI YDNQGIELHCI RRCDRVTRLEFLPFHFIliATASE 
TGFIjTYLDVSVGKI VAAiNART^RLDVMSONPYTIAVI HLGHSNG 

tvslwspawkeplakilchrggvravavdstgtymatsgldhql 
kifdlrgtyoplstrtlphgaghlafsorgllvagmgdwniwa 
gogkas p psleqp ylturlsgpvhglqfcpfedvlgvghtgg i t 
smlvpgagepnfdgleskpyrsrkqroewevkallekvpaelic 

LD PRAT AE VD VI S JL.EQG K K E Q I ERLG YDPOAKAPFQ PX PKQKGR 
SSTASLVKRXRKVMDEEHRDKVR0SLO0QHHKEAKAKPTGAR PS 
ALDRFVR 


6253 


27 


1897 


EEFCTVnAVRVGEMETAFXPGXTjVPPKXJ)K10TKRKKPRRVWEE 
ETVPTTAGASPGPPRNKKN R ELRPCR P KNAY ILKKSR I SKKPQV 
PKKPREWKJNPESQRGLSGAODPFPGPAPVPVEWQKFCRIDKSR 
KbPHSKAKTRSRLEVAEAEEEETSlKAARSELLLAEEPGFLEGE 
DG EDTAK I CQAD I VEAVD I AS AAKHFDLNLRQFGP YR LNYSRTG 
RHLAFGGRRGHVAALDWVTKKI^CEINVMEAVRDIRFLHSEALL 
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SEQ "' ^ 









AVAQNRWLH ■ YDNOGIELHClKH.DRVTRLtrLPFHFLLATASE 
TG? LTVLDVSVGK1 VAAUNAR As-r. LEVMSQNPYNAV3 HLGHSNG 
T VSt jWS P AMK EPLAKI LCHRGG V K A VAVDSTGT YMATSGLDHQli 
KlFDLRGTYOPLSTRTLPHGAGH^/iFSQKGLLVAGMGDWNIWA 
GOGKASPPSLEOPyLTHRLSGPV.^GLOFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPVRSVKOROEWEVKAbLEKVPAELlC 
LD PRALAEVD V I £ LEQG KKEQ I r. K 1 wYDPOAKAP FQ P KP KOKGR 
SS TA5LVKRKR KVMDEEHRDK V h C S L0O0HH K EAKAK PTGARPS 
ALDRFVR 


6254 


I5i: 


113S 


HA LGRRGGSOE LS AAACGCFAX?. LRAPGSGR PALAPGAAAFAGL 
GGAPRFPPRGSAAGRTMLLXEYkI CMPLTVDEYKIGQLYMISKH 
SHEOSDRGEGVEWQNEPFEDPJ-IGNGQFTEKRVYLNSKLPSWA 
RAW PK 1 FYVTEKAWNYYPYTJ ?E YTCS FLP KFSIHIETKY EDN 
KGSNDT1 FDNEAKDVEREVCF1 L' 1 .hCDEI PERYYKESEDPKHFK 
SEKTGR60LREGWRDSHQPIMCSVKLVTVKFEVWGLQTRVEQFV 
HKWRDl LLI G HRGA FAWVDEW Y 1 WMDDVR EYE KNNKEQTNI K 
VCNQHSSPVDD1ESHAQTST 


6255


3 


1444 


PTRPCX>ELLVS1ATV1FVAS0K7-A S VESKAV1 KQQLESVSNGWT 
VYR I ARQAS RMGNHDMAKELYO^ 1 - 1 .TQVASKH FY FWLNSLKEFS 
HAEOCLTGL0E ENY5 S ALSC1 A E 5 I .K FYU KG I AS LTAAS T PLNP 
LS FQCEFVKLR J DLLQAFSQLl C7 CNSLKTS PPPAI ATT I AMTL 
GNDLORCGRlSNOMKQSMEEFRFLASRYGDliYOASFDADSATLR 
NVELQOOSCLLISHAIEALILDP'SASFOEYGSTGTAHADSEYE 
RRNMSVYNHVI,EEVESLNGKYTP v rYMHTACl.CNAIIALLKVPL 
S FOR Y F FQKWS TS I K1ALS PS r !, U PAEPIAV0NNQOLALKVEG 
WQHGSKPGLFRKI QSVCLNVSS 1 1 ,0SKSGQDY K I PI DNMTNEM 
E0RVEPHNDYFST0FLLNFA1LG1 i :NI TVESSVKDANG3 VWKTG 
PRTTI F\n<SLEDPYSQQIRL(XCCAOOPL<XX>OORNAYTRF 


6256 


1 


1542 


CRGAGAE PAAN PR S PRS LVPSLE 5 1 STSVPFAPGTMATDSWALA 
VDEQEAAAESLSNLHLXEEKIKF 1 TNGAWKTNANAEKTDEEEK 
EDRAAOSLLNKLIRSNLVDNTNOVEVLORDPNSFLYSVKSFEEL 
RLK PQLLQG" VYAMG FNR PSKI QE NAL PLMLAEP PQNLI AQSQSG 
TGKTAAFVLAM1>SQVEPANKYP0^ ~CLSPTYE LALQTG XV 1 EQM 
GKFYPELKLAYAVRGNKLERGQK ■ £ E Q I V I GT PG TVLD WCS KLK 
F2 DPKK I KVFVLDEADVMIATOGi X OQS I Rl QRNLPRNCQMLLF 
SA1 r tlJb VWArAVKV VPl>rWVlf%Iir.hc<tt iLdJ 1 J A-v* 1 vkloor 
DEKFQALCNLYGAITI AQAMI FCH7K KTASWLAAELSKEGHQVA 
LLSGEMMVEQRAAV I EK FREGKE H V L V1TNVCARG I DVEQVS W 
I NFDLPVDKDGN PDNETYLHRI GF TG R FGKRGLA VNMVDS KHSM 

nilkriohhfnkkierldtddldl: EKIAN 


6257 


210 


615 


AFT PAMAELIOKKL0GEVEKYQ01 t KDLSKSMSGRQKLEAQLTE 
MNI VKEE LALLEK3SNWFKLLGP VL-VKQELGE ARATVG KRLDY I 
TAEI KR YESQLRPLERQS BQQRETLAOLOOE FOR AQAAKAGAPG 
KA 


6258 


210 


615 


AFIPAMAEblOKKWEVEKYWLCKDLSKSKSGROKLEAQLTB 
NNIVKEEliALLDGSNVVFKLLG^Vl VKQELGEARATVGKPJLDYI 
TAE1KRYESOLKDLERQSBOX?RE7LAOLQOEFORAOAAXAGAPG 
KA 


6259 


2 



1540 


1LEKGFPSQCHFERKWKVDDVLESS0ENEDDHFWELLFHNNKTV 
SVSNGDRGSKTFNLGTDPVSLRNYrYKlCDSCEKKLKNISGLII 
SKJC^CSRKKPDEFWCEXLLLDIk^KIPIGEKSYKYDOKRNAI 
N YHODLS QPSFGOS FEYSKKGOG FK Z EAAFFTMKRSQ I GETVCK 
VNECGRTFIESLKLNISQRPHLEKEPYGCSICGKSFCMNLRFGH 
0RALTKDNPYEYNEYGEIFCDNSAF1 1HQGAYTRKILREYKVSD 
KTWEKSALLKHOI^MGGKSYDWENGSNFSKKSHLTQLRRAHT 
GEKTFECGECGKTFWEKSNLTQHCKTHTGEKPYECTECGXAFCQ 
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SEC 








KPHLTNHQRTHTGEKPYECKOCGKTFCVK. c NLTEHORTHTGEKP 
YECNACX5KSEC}JRSALVVHORTHTGEKPF5CNECGKSFCVKSNIi 
IVHQRTHTGEKPYKCNECGKTFCEKSAX7KH0RTHTGEKPYECN 
ACGKTFSQRSVLTKHQRIKTRVKALSTS 


6260 


2081


1436 


GTG PE 1 HACAHAS ARAPGSR AMALRELKVC LLGDTG VGKS S I WJ 
RFVEDSFDPN1NPT3GASFMTKTVQY0NELHKFLIWDTAGQERF 
RALAPMYYRGSAAA11VYD3TKEETFSTJ.KNWVKELRQHGPPN1 
VVAI AGNKCDL3 DVR E VK< £ RDAXD YADS 1 MA 1 FVETSAKNAINI 
NELF1E1SRR1 PSTDANLPSGGKGFKLRRGPSEPKRSCC 


6261 




1188 


FWYRLGPGTRSRWpRRGSWAASLVPRGPSFAALVTSPCPPDPLR 
SPACEPCRPDFAPRPALI.LRSGPRSAPAVTGKPALKGQPGPWPG 
l*AE VS I DOS KLPG VKEVC R DFAVLEDHTLAHS LQEQE 3 EH H LAS 
NV'ORNRLVOHDLOVAKO LQEEDLKAOAOLC^K R YK DLEQODCEI A 
0E IQEKLA I EAERRR 1 OEX KDED 3 ARLLCEKELQEEKKRKXHFP 
EFPATRAYADSYYYEDGGMKPRVMKEAVSTFSRMAHRDQEWYDA 
ElAJ^KLOEEELI^TOVPKRAAOVAQDEEI/^KLLMAEEKXAYKKA 
KEREKSSLDXRKQDPEWKPKTAKAANSKSKESDEPHHSKNERPA 
KPPPPIMTDGEDADYTHFTNOOSSTRHFSKSESSHKGFHYKH 


6262 


'< 




PECHSOGLCSVHRPGKVPOARWSGLVLGQP.UEPAGHRLSOKKiL 
CSTRLVSGGLEALRSKHQAVWSLSQTIECLQOGGHEEGLVHEK 
ARObRRSKENIELGLSEAQVMLALASHLSTVESEKOKLRAQVRR 
LCQ ENQWLRDELAGTOQR LQRS EQAVAOLF E E KKHLE FLGQLRQ 
YDEDGHTSEEKEGDATKDSLDDLFPNEEEEDPSNGLSRGQGATA 
AQOGG Y EI PARLRTLHNLV I QYAAQGRYEVAVPLCKQALEDLER 
T SGRGH P DV ATMLN I LAb V Y RDQN KY KEAAK LLN D ALS IRE STL 
GPDH PAVAATLNN LAVLY G KRGKYKEAEPLCORALE I REKVLGT 
NKPDVAKQL1WLALLCCTJQGKY E AVERY YCRALA 1 Y EGQLG PDN 
FNVARTKNN LASCYLKQG KYAEAETLY KE 3 i .TRAHVQEFGS VDD 
DHKPIWMHAEEREEMSKSRHHEGGTPYAEYGGWYKACKVSSPTV 
NTTLFOJLGALYRROGKLEAAETLEECALRS'RROGTDPJSQTKVA 
EL1.GESDGRRTSQEGPGDSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSLGKIRDVLRR 


6263 


1 


2406 


RELDSLADLPERI KPFYAA'GLSTSHLRSSSVEDVKLI 3SEGRPT 
I EVRRCSMPSV 1 CEHTKC FQT1 SEESNQGS LLTVPGDTSPS PKP 
EVFSNVPERDLSNVSN3 HSSFATSPTGASNSKYVSADRNLI KNT 
APVN'J'VMDSPVIILEPSSOVGVIOKKSWEMPVDRLETLSTRDFIC 
PNSN I PDQE ££LQS FCN S ENKVL KEN AD FLSLK 0 TELPGNS CAQ 
DPAS FM P PQQPCS FPSQS 1 -SDAES 1 S KHMS LS YVANQEPG I LQQ 
KNAVQI T SSALDTDNESTKDTENTFVLGDVQKTDAFVPVYSDST 
3 OEAS PNFEKAYTLPVLFSEKDFNGSDASTOLNTHYAFSKLTYK 
SSSGHEVENSTTDTQVISKEKENKLESLVL'J-HLSRCDSDLCEMN 
AGKPKGNLNEQDPKHCPES EKCLLSI EDEESQQS I LSSLENHSQ 
OSTQPEMHKYGQLVKVELEENAEDDKTEN01PORMTRIWANTMA 
NOS KQ I LAS CTLLS EKDS ESSS PRGR I RLTEDDD PQ I HH PR KRK 
VSRVPQPVQVSPSLliQAKEKTOOSLAAIVDSLKLDEIQPYSSER 
AJ4PYFEYLHIRKKIEEKRKLLCSVlPQA?0rYDEYVTFNGSYLL 
DGNPLSKICIPTITPPPSLSDPLKELFRQgEWRMKLRLQHSIE 
REKLIVSNEQEVLRVh^RAARTX^QTLPFSACTVLLDAEVYNV 
PLDSQSDDSKTSVRDRFNAROFMSWLODVDDKFPKLKTCLiMRQ 
OHEAAALNAVORLEWQLKLOELDPATYXS I S I YEIQEFYVPLVD 
VNDDFELTPI 


6264 


143 


1960 


KHRQENNA LDMA PE I HMTG PMCL I ENTNGE L VAN PEALK 1 LS A I 
TQPWWAIVGLYRTGKSYLr-INlCLAGKNKGFSLGSTVKSHTKGI 
W^WCVPHPKKPEHrL\n^LDTEGLGDVKKGDNC«DSWIFTLAVLL 
SSTL\TiNSKX5TIN(^AMIX?LYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGOPLTPDEYLEYSLKLTOGT 
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BNSDOOD: <WO 015331 2A1J_> 














SOKDKNhNLPRLCJ K KFFPKKKCFVFDLP I HRRKLAQLEKLQDE 
ELDPEFVQOVAJDFCSY I FSNS KTKTLSGG I K VNGPRLESLVLTY 
1 N A I SRGDLPCKENA VLALAQ I ENSAAVQKA I AHYDQQMGQKVQ 
LPAETLQELLDI.HRVSEREATEVYMKNSFKDVDHLFQKKLAAOli 
DKKRDDFCKQNCEASSDRCSALLQVIFSPLEEEVKAGIYSXPGG 
Y CbF 1 QKLQDLEKKY Y EEPRKG1 QAEE 3 LQTYLKSKES VTDAI b 
OTDQI LTEKEKEIEVECVKAESAQASAKMVEEMQI KYQQMMEEK 
EKSYCEHVXOLTEKMERERAObLEECEKTbTSKLOEOARVLKER 
COG ESTQLQNE J QKbQKTLKKKTKSYMSHKbKI 


6265 


143 


1960 


^ROENmbDMAPElHMTGPMCLIENTNGrbVANPEJOjKlLSAI 
TQPVWVAIVGLYRTGKSYLMNKI»AGKNKGFSLGSTVKSHTKGI 
WMWCVPHPKKPEHTLVbbDTEGLGDVKKGDNONDSWIFTLAVLL 
SS7LVYNSMGTI NOOAMDQLYY VTELTHR I RSKSSPDENENEDS 
ADFVSFFPDFVWTbRDFSLDLEADGQPLTPDEYLEYSLKbTQGT 
SQKDKNFNLPRLCIR KFFPKKKCFVFDLP I HRRKLAQLEKLQDE 
ELDPEFVQQVADFCSY J FSNSK7KTLSGG2 K VNGPRLESLVLTY 
3 m 1 SRGDLPCMENAVLAIAQl ENS AAVQKA I AHYDQQMGQKVQ/ 
I,PAETLQELLDUJRV£EREATEVYMKNSFKDVDHLF0KKbAAOb 
DKKR DDFCKQNQEASSDR C5ALLQV2 FS PLEEEVKAGI YSKPGG 
YCLFI QKLQDLEKKYY EEPRKG i QAEE I LQTYLKSKES VTDAI L 
0TDOILITKEKEIEVECVKAESAOASAKMVEEMOIKYCX3MMEEK 
EKSYQEHVKOLTEKMERERAOLLEEQEKTLTSKLQEOARVLKER 
COGESTQL0NE1QKL0KTLKKKTKRYMSHKLKI 


6266 


276 


1421 


GSHQKQMLVPCFLYSLQNRKPSLYGSLTCQG 1 GLDG J PEVTASE 
GFTVNEINKKSIH3SCPKENASSKFLAPYTTFSRIHTKS1TCLD 
I S S RGGLGV S S STDG TMK I WQ AS NG ELR R V L EGHVFD VN CCRF F 
PSGLWLSGGMDA0LKIWSAEDASCWTFKGHKGG1LDTAIVDR 
GRNWSASRDGTARLWDCGRSACLGVLADCGSSINGVAVGAADN 
S 1 H LGS PEQMPSEREVGTE AKJiLLU^EDKK LQCLGLOSROLiV F 
LFIGSDAFNCCTFLSGFLLLAGTODGNIYQLDVRSPRAPVQVIH 
RSGA PVL5LLS VRJDGFI ASQGDGSCFI VQODLDYVTELTGADCD 
PVTKVATWEKQ1 YTCCRDGLVRRYQLSDL 


6267


3 


622 


LGMMKKNNSAKRGPODGNQOPAPPEKVGMVRKFCGKGIFREIWK 
NRYWLKGDQLY1SEKEVKDEKNIOEVFDLSDYEKCEELRKSKS 

ITRAKNRILDEVTVEEDSYLAHPTRDRAK10HSRRPPTRGHLWA 
VASTSTSDGMLTLDLIQEEDPSPEEPTSLC 


6268 


160 


1368 


HRELCQNLPAGLSSAL1 DNPLTLLLS I DTYVMLQEPVTFQDVAV 
DFSREEWGLLGPTQRTEYRDVKLETFGHLVSVGWBTTLENKELA 
PNSDIPEEEPAPSLKV0ESSRDCA1.SSTLEDTLOGGVOEVQDTV 
bKCMESAQEKDLPQKKHFDNRESOANSGALDTNOVSLQKIDNPE 
SOANSGALDTNQVLLHKI PPRKRLR KKDSQVKSMKHNSRVKIHQ 
KSCEROKAKEGNGCRKTFSRSTKOITFIRIHKGSQVCRCSECGK 
I FRNPRYFSVH KK1 HTGER P YVCQDCG KGFVOSSSLTOHQRVHS 
GERPFECQECGRTFNDRSAISOKLRTHTGAKPYKCQDCGKAFRQ 
SSHL1RHQRTHTGERPYACNKCGKAFT0SSHLIGHQRTHNRTKR 
KKKQPTS 


6269 


28B6 


1445 


HA S APTRRKMAAAS PLR DCHAW KD AR LP LSTTSN EACKX.FDATL 
T0YVKVTTNDKSLGG1EGCLSKLKAAI)PTFVMGKAWATGLVLIGT 
GSSVKLDKELDLAVKTMVE I SRT0PLTRREOLH VSAVETFANGN 
FPKACELWEOILQDHPTDMLALKFSHDAYFYLGYQEOMRDSVAR 
I YP FWTPDI PLSSYVKGl YSFGIJ^ETNFYDOAEKLAKEAI/SINP 
TDAWSVHTVAH IHEMKAE IKI>GLEFMOHSETl,WKBSDMLACHNY 
WKWALYLIEKGEYEAALTIYPrHlLPSWAKT>AMLDVVDSCSML 
Y R L<?MEG VS VGQR WQDVLP VAR KHSRDH I LLFNDAHFLMASLGA 
HDPOTTQELLTTLRJ)ASESPGENCC>HLIJ^RDVGLPLCOALVEAE 
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WO 01/53312 
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SEC 








DGNPDRVLELlXp I R YR 1 V QLGGSNAQR DV FNQL L I HAALNCTS | 
SVHKNVARSLLKERDALKFNSPWERURKAATVHU4Q ] 


6270 


23 


2C66 


SV1VTLGSEGDGRPPTYHLEEMEQEPQNGEPAEIKI IREAYKKA 
FLPVNKGLNTDELG0KEEAKNYYK0G1GHLLRGISI SSKESEHT 
G?GWESAR0M0C'KMKETI J CK\T?TRLEILEKGLATS1>0NDLQEVP 
K L Y P E F P P KDKC E Kb P E P 0 S FS S A PQ KAE VNGNTS T P S AG AV AA 
PASL>SbPSQSCPAEAP?AYTPQAAEGHY7VSYGTDSGEFSSVGE 
E F YRNHSQP PPLETLG LDADEL 1 L I PNGVQ 1 FFVN PAG EVSA PS 
YPGYLRIVKFLDNSLDTVLKRPPGFLOVCDWLYPLVPDRSPVLK 
CTAGAYMKPDTMLOAAGCFVGWLSSELPEDDRELFEDLLROMS 
DLRLQANWNRAEEENEFQIPGR7RPSSDQLKEASGTDVKQLDQG 
N KD VR HKG KRG KRAKDTS S E EVN LSH 1 VP CEPVP EE K P KELPEW 
SEKVAHN1LSGASKVSVJGLVKGAEITGKA1QKGASKLRERI0PE 
EK P VEVS PAVTKGLY I AKO ATGGAAKVSQFIiVDG VCTVANCVG K 
EUAPKVKKHGSKLVPE5LKKDKDGKSPLDGAMVVAASSVQGFST 
VWQGLECAAK C J VNNVS AETVQT VR Y K Y G YNAG EATHHA VDS A V 
NVGVTAYNJNNIGlKAMVKKTATOTGHTLbEDYQIVDNSORENO 
EG AANVNVRG E KDEQT K E VKE AK XXDK 


6271 


32 




GCGVKTAGMVGREKELS1 HFVPGSCRLVEEEVN1 PNRRVLVTGA 
TGLLGRAVHKEF00N1W11AVGCGFRRARPKFE0VNLLDSNAVHH 
1 3 HD FQPHV I VHCAAE R R PD WENQPDAASCLNVB ASGN LAK EA 
AAVGAFLIYISSDYVFDG'I-NPPYREEDIPAPLNLYGKTICLDGKK 
AVLENNLGAAVX.RI P I LYGEVEKLEESAVT VMr DKVQFSNKSAN 
MDH WOOR FPTH VKDVAT VCRQLA E KRMLDPS I KGTFHWS GNEQM 
TK Y EMACA T A DA FNLPS $ H LR PI TDS PVLGAQR PRH AO LDCS Kb 
ETLG I GORT P FR 1 G I K E S LW P Fb 1 DKRWRQTV FH 


6272 


1136 


526


GAV>lEDAAAPGRTEGVLEROGAPPAAG0GGALVELTPTPGGiAL ; 
VSPYHTHRAGDPLDLVALAEQVOKADSFIRANATNKLTVIAEQI 
OHLQEOARKVLEDAHRDA^LHHVACN I VKKPGN I Y YL Y KR ESGQ 
OYFSIlSPKEWGTSCPHDFLGAYKbOHDLSWTPYEDIEKODAKI 
SMMDTLLSQSVALPPCTEPNFOGbTH 


6273 


256 


84


SCPRV^PECRSLGCOVMFSLPJ-NCSPDHIRRGSCWGRPQDLKIA , 
SAAWNSKCHPGAGAAMARQHAJ^TLWYDRPRYVFMEFCVEDSTDV 1 
HVLI EDHR 3 VFS CKNADGVELYNE 3 E FYAKWSKDSQDKRS SRS 
ITCFVRKWKEKVAWPRLTKEDIKPVWLSVDFDKWRDWEGDEEME
LAHVEHYAEVRDNTYCVLPT 1 


6274 


56 


1142 


AAAAMAAAAGGGAGAAR SLSR FRG CLAG ALLGDCVG S F Y EAKDT 
VDLTS VLRHVC S 1 > E P D PG T PG S ERTE A1»Y YTDDT AMARAL VQSL 
LAKE AFDE VDMAHR FAOE Y KiGD PDRG YGAG WTVFKKLLNPKCR 
DVFE PARAQFNGKGS YGNGGAMRVAG I SLAYSSVQDVQKFARL5 
AQLTHASSl^YKGAlZ^AJ^VHbAiOGESSSKHFLXOLLGHMED 
LEGDAOSVLDAREbGMEERPYSSRLKKIGEliLDOASVTREEWS 
ELGNGIAAFESVPTAIYCFbRCMEPDPEIPSAFNSLQRTIjIYSI 
SLGGDTDT3ATMAGAIAGAYYGMDQVPESW0QSCEGYEETDILA 
QSLHRVFQKS ] 


6275 


20 




srrgrarclargsrRpvfrpaktmafmvktmvggqlknltgslg 

GGEDKGDGDKSAAEAQGMSREEYEEYOKQLVEEKMERDAQFTQR 
KAERATLRSHFRDKYRLPKNETDESQIQMAGGDVELPRELAKMI J 
EEDTEEEEEKASVLG0LASLPGLNLGSLKDKAOATLGDLKOSAE 
KCHVM 


6276 


797 


97 


TLLPLPPLPDTEGMILLNTGLEGTVAENPVP1VHTPSGNILTLE 
SCLQQLATHPGKWG 3 H LQ1 AEPAALRPSLAIXARLSSLG LLHWP 
VWVGAKISHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV 
LGSG YREQLLTDMLELCOGLW 0 PV S PQMQAMLLGH STAG A I GR L 
liASSPRATVTVEHNPAGGDYASVRTALLAARAVDRTRVYYRLPO 
GYHKDLLAHVGRN 


6277


4600 


2744


ytAFRTEMGLYYSyFKTIVEAPSFLNGVWMIMNDKLTEYPLVINT 
LKRFNLYPEVILASWYRIYTKIMDL1GI0TKICWTVTIGEG!»SF 
TESCEGUGDPACFYVAV1 FJ LNGLMMALFF1 YGTYLSGSRLGGL 
VT VLCF FFNHG E CTR VMW TP PLR ES FS Y PFLVLQMLLVTH I LR A 
TKJLYRGSL3ALCISNVFFMLPW0FAQFVLLT0IASLFAVYWGY 
IDICKLRKI1YIHM1SIALCP/LKFGNSMLLTSYYASSLVIIWG 
1LAMKPHFLKI NVSELSLWVIQGCFWLFGTVILKYLTSKI FGI A 
NDAHIGNLLTSK?FSYKDFDTLLYTCAAEFDFMEK2TFLRYTKT 
LLLP WLVG F\^A I VRK 1 1 S DMWGVLAKQ0THVRKHOFDHGE L VY 
HALQLLAYTALGILlMRLKbrbTPKMCVMASblCSRQLFGWLFC 
KVH PGA I VFAI LAAMS 1 OG S AMLQTQWN I VGEFSM^PQEEL I EW 
1 K YSTK PDAVFAGAMPTMAS VKI^SALR P I VNHPHYEDAGLRART 
KIVYSMYSRKAAEEViO^ELIKLKVNYYILEESWCVRRSKPGCSM 
PEIWDVEDPANAGKTPlCNH,VKDSKPHFTTVF0NSVYKVLEVx7 
KE 


6278 


3 


823 


IliFRLVLLSLVYLLNSVATEERKPAEVLIVEGOQYAWGTVLbL 
I RI I LE YCOGVDNI ?S VTTDKl/rRLSDLLK YFNSKS CQLVLGAG 
AIJQWGLKT ITT XH1ALS SR CLQL1 VII Y I PV I RAHFE ARLP P KQ 
YSMLRHFDHITKDVHDHIAEISAKLVAIMDSLFDKLLSKYEVKA 
PVPSACFRNICKOMTKMHEAIFDLLPEEOTQMLFLRINASYKLK 
L KKQLS HLNVJNDGGP QNG 1, V T ADV A F Y TGN LQ AL KG LKDLDLN 
MAEIWEQXR 


6279 


12-/ 


1687 


GGAMAS DGAR KO FWK3SNS K L PGS 1 QHVYGAQH P P FDPLLKGT L 
LRSTAKMPTTPVKAKRVSTFQEFESNTSDAWDAGEDDDELLAMA 
AESLKSEWMETANRVLRNHSUKQGRPTLOEGPGL00KPRPEAE 
PPSPPSGDLRLVKSVSESHTSCPAESASDAAPLQRSQSLPKSAT 
VTLGGTSDPSTI.SSSALSEREASRLDKFKOLIjAG PNTDLEELRR 
LSWSGIPKPVRPMTWKLLSGYLPANVDRRrATL0RK0KSYFAF2 
EHYYDSRJ3DEVH0DTYRQ1HID1PRMSPEALILOPKVTE1FER: 
LFIWA1RHPASGYVQGINDLVTPFFVVFICEYIEAEEVDTVDVS 
GVPAE VLCN I EADTYW CMS KLLDG I QDN YTFAQPG J OMKVKMLE 
E LVSR 1 DEQVHRKLDQHE VR Y LQFAFRWMNJJLLMKE VPLRCT I R 

lwptyosepdgfshfhlyvca;>flvkwrkeileekdfoelllfl 

ONLPTAHWDDEDISLUAEAYHLKFAFADAFNHYKK 


6280 


857 


251b 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVDLAC VLAYU.RRGQVR LVQCCGAANLQFl QALLDSEEENDRA 
WDGRUGDRYNPPVDAT P DTRELE FNE1 KTQYELATGQLGLRRAA 
QKHSFPRNLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YSQKAFCG I YSKDGQI FMS ACC-DQTI RLYDCR YGRFRKFKS I K A 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIH1 CN I YGRGDTHTALD 
LRPDERR F AV FS I AVS S DG R EVLGG ANDG CL YV FDRE QNRRTLC 
IESHEDDVNAVAFAD1SS0ILFSGGDDAICKVNDRRTMREDDPK 
PVGALAGH0DGITF1DSKGDARYLTSNSKD0TIKLWD1RRFSSK 
EGMEASROAATQONWDY R WQQV P KKAWRKLKLPGDS S LMT YRGH 
G VLHTLI RCR FS P 1 HS TGQQ FI Y S G CSTGKVWYDLLSGH I VK K 
LTHHKACVRDVSWHPFEEKIVSSSWDGNLRLWOYRQAEYFODDM 
PBSEECASAPAPVP0SSTPFSSPO 


6281 


85-7 


251S 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEBEEEDE 
DVDIAQVLA YLLR RGQVR L VQGGG AANLQFIQAL LDS EEENPRA 
WDGRLGDRYNPPVDATPDTRSLEFNEIKTQVBLATGOLGLRRAA 
QKHSFPRWLHORERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YSQKAFCG I YSXDGQI FMSACQDQT 1 RLYDCRYGRFRXFKS I KA 
RDVGWSVLDVAFTPDGmiFLYSSWSDYIHICKlYGEGI>THTAi,D 
LRPDERR FAVFS I AVSS DG R E VLGGANDGCLYVFDREWRRTLQ 
I ESHEDDVNAVAFADTSSQ I LFSGGDDAI CKVWDRRTMREDDP K 
PVGALAGHQDGI TFI DSKGDAR V LI SNS KDQTI KJL^IRRFSS R 
EGMEASROAATO0NWDYRWOQVPKKAWRKLKLPGDSSLMTYRGH 
GVLHTL1RCRFSPIHSTGO0FI YSGCSTGKVWYDLLSGH1 VKK 
LTN H KACVR D VS WH PF E E K I VSSS WDGNLRL WQ YRQAE Y FQDDM 
PESEBCASAPAPVPQSSTPFSSPC; 


6282 


12b 


906 


R^AACRALKAVLVDLSGTIjHI edaavpgaqealkrlrgasvi ir 
FVTNTTKESKQDLLBRURKLEFDISEDEIFTSIjTAARSLLEHKQ 
VRPMLLVDDRALPDFKGIQTSDPHAWMGLAPEHFHYOILNOAF 
RLT.LDGAPLI AI HKARY YKRKDG1ALGPGPFVTALEYATDTKAT 
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAODVGML. 
G3 LVXTGKYRASDEEK1NPPPYLTCESFPFAVDHI LOHLL 


6283 


140 


1043 


LSLFGIHVMNPFVISMSTSSVRKRSEGEEKTLTGDVKTSPPRTAP 
KKQLPSIPKNALPITKPTSPAPAAOSTNGTHASYGPFYLEYSLL 
AEFTLWKOKLPGVYVQPSYRSALMWFGV1FIRHGLYODGVFKF 
TVY Z PDNYPDGDCPRLVFDI PVFHPbVDPTSGELDVKRAFAKWR 
RNKNH I WQVLM YARRVFYK I DTAS PLNPEAAVL YEKPI OLFKSK 
WDS VK VCTAR LFDQPK 3 EDPYAI S FS PWNPS VHDEAREKMLTQ 
KKKPEEQHNKSVHVAGLSWVKPGSVQPFSKEEKTVAT 


6284 


a 


2879 


RSV1 PGST1 SSRWPGLSRPRFMAAHEWDWFOREELTGOJ SDI RV 
ONL0VERENVOKRTFTRWI NLHLEKCNPPLEVKDLFVD1 QDGKI 
L^LLEVLSGRNI.LHEYKSSSHRIFRIiNfNlAKALKFLEDSNVJa- 
VSI DAAEI ADGNPSLVLGL I WNII LFFQ1 KELTGNLSRNS P.SSS 
LAPGSGGTDSDSS FPPTPTAERSVA1 SVKDQR KAI KALLAWVQR 
KTRKYGVAVODFAGSWRSGIAFLAVI KA 1 DPSLVDMKCALENST 
RFNLEKAFSTAODALHIPRLLEPEDIMVDTPDEQSIMTYVAQFL 
ERFPELEAED3 FDSDKEVP I ESTFVRIKETPSEQF.SKVFVLTEN 
GERTYTVNHETSHPPPSKVFVCDKPESMKEFRLDGVSSHALSDS 
STEFMHQ1 1 DQVL0GGPGKTSDISEPSPESS1LSSRKENGRSNS 
LPI KKTVHFEADTYKDPFCS KNLSbCFEGSPRVAKESLRODGHV 
LAVEVAEEKEOKQESSK1 PESSSDKVAGDI FLVEGTNNNSQSSS 
CNG ALE STARHDE ES HSLS P PGENTVMADS FO I KVNLMTVEALE 
EGDY FEAI PLKAS KFNSDL I DFASTSQAFNKVPS PHETKPDEDA 
EAFF^AEKLGXRSIKSAHXKXDSPEPOVKMDKHEPHODSGEEA 
EGCFSAPEETFVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFXPSPPLSKVSVI PHDLF 
YFPHYEVPLAAVLEAYVEDPEDLKNEEMDLEEPEGYMPDLDSRE 
BEADGSQSSSSSSVPGESLPSASDQVLYLSRGGVGTTPASEPAP 
LAPKEDHOORETKENDPMbSHOSOESPNLENIANPLEENVTXES 
I£SKKKEI«KHVDHVESSLFVAPGSVQSSDDLEEDSSDYS1PSR 
TSHSDSSIYLRRHTHRSSESDHFSLCSVEERSRSG 


6285


2157 


1331 


SCKTENLLSKWWFQOGLSFLPSALVI WTSAAF1 FS YI TAVTLHH 
IDPALPYISDTGTVAPEKCLFGAMLNIAAVLCIAT1YVRYK0VH 
ALSPEENVI I KLNKAGLVLGJLSCLGhS2VA2^FQKTTLFMHVS 
GAVLTFGMGSLYMFVQT1LSY0MQPKIHGK0VFWIRLLLVIWCG 
V<? Al.SML.Tr^^VT,^ 9GMFGTDT .FOKT.HWKPFDKGYVLHM1 TTAA 
E W SMS FS FFG FFLT Y I R DFQK I S LRV EAN LHGLTLYDTAPCP IN 
NERTRLLSRD1 


6286


1619 


276 


kagasccgsanpytvsvgkscvllamaqlqtrfytdnkkyavddv 
pfsi paasei adlsni inklbkdknefhkhvefdfli kgqflrm 
pldkhmemeni sseeweiey vekytapqpeocmfhddwi ss ik 
gaeew1 ltgs ydktsr i wslegks i mt i vghtd wxpva w vkkd 
sls clllsasmd0t3 llwewnvernkv kalhccrghags vds i a 
vdgsgtkfcsgswdkmlx1wstvptdeedemeestnrprkkqkt 
eolgltrtpi vtlsghmeavssvlwsdaeei csaswdhti rvwd 
vesgs lkstltgn kvfnc i sys plcxr lasgs tdrh ir lwdprt 
kdgslvslsltshtgwtsvkwsptheoqlisgsldnivklwdt 
rsckaplydlaahedkvbsvdwtdtglllsggadnklysyrysp 
TTSHVGA 


6287 


278 


1482 


M0FFFNF01GLRS?SGK£KySGDAGFbGDALCLFLQCLALDEDF 
APAKLQVOKIUCDU.I.PENLKEGLKESSWSSl.PCTKT^RPFDFHS 
VMEESQSLNSPSPKOSEEIPEVTSEPVKGSLNRAQSACSINSTE 
MPAREOCbKRVSSEPVLSVQEKGVLLKRKLST;LEQDVlVKEDGR 
MKbKKOGETPMEVCMFSLAYGDIPEEblDVSDFECSLCMRLFFE 
PVTTPCGHS FCKHCLERCLDHAPYCPLCKF.S LKEYLADRR YCVT 
OLLEELl VXY LPDELS ER KK 3 Y DEBT AELSK LT KJJV P 1 F VCTttA 
YPTVPCPl>HVFEPRYRbMlRRSIQTGTKQ?GMCVSDTONSFADY 
GCML01 RTJVHFhPDGRSWDTVGGKRFRVLKRGMKDGY CTADI E 
VLEDV 


6288 




743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDJLAICLGAECLR 
MLDSGADYLHLDVMPGHFVPNITFGHPVVESLRKGLGQPPFFPM 
HMMVSKPEOWVKPMAVAGANOYTrHLEATENPGALI KOIRENGM
KVGLAI KPGTSVEYLAPWANQI DMAL VMT V E PG FGGO K FMEPMM 
PK VHWLR TO FPSLD I E VDGGVGPDTVH KCAEAGANM J VSGSAJM 
RSEDPRSV3NLLRKVCSEAAQKRSLDR 


6289




743 


VTLYPCRCLVGNLLLG ASGMASGCK1 GPS I LNSDLAN1 GAECLR 
MLPSGADYJLHLDVMDGHFVPKITFGHFWE^LRKOLGOUPFFDM 
1 IMMVS KPEOWV XPMAVAGANQYTFHLEATEN PGAL1 KD 3 R FNGM 
KVGLAI K PG TS VEY LA P W ANQI PMAL VMT V KPG FGGO K FMEDKM 
F KVHWI.RTQFPSLD3 EVDGGVGPDTVHKCAEmGAMKI VSGSA1M 
R S ED PRSV 1 N LLRNVCS E AAQKRS LDR 


6290 


3 


t lBSfc 


TUJRWLLGVYETVAPTLACLPRPRLRRftRRUR RR RM1 £ R YTRKA 
VPOSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS 
D1TRESSFTSADTGNSLSAFPSYTGAGISTEGSSDFSWGYGELD 
CK ATEKVQTMFTAT DELLY EQKLSVHTKSLOE ECQQWTA SFPHL 
R1LGR0IITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFG1R 
GKXLHFSSSYAHKASS 3 AKSSSFCSMERDEEDSI IVSEG1 IEEY 
1J,FDH1D1EEGFHGKKSEAATEK0KLGYPP3APFYCMKEDVLAY 
VFDSWCKVVSCMEQLTRSHWEGFASDDESNVAVTRPDSESSCV 
L SELHPLVLPRV PQSK VLY I TSK PMSLCWSKHQPNVNPLLVHG 
?iPLOPRNLSLMDKLLDLDDKLLMRPGSST3LSTKNWPNRAVEFS 
TSSLSYTVQSTRRRNPPFFTLHPISTSHSCAETPRSVEEILRGA 
RVPVAPDSLSSPSPTPLSRlINLLPPIGTAEVEJrVSTVGPOROMK 
PHGPS SRAQS AWDEPN Y QOPQERLLLPPFFPK PNTTOS FLLDT 
0YRRSCAVEYPHQARPGRGSAGPQLHGSTKSQSGGRPVSR7RQG 
F 


6291 


1732 


602 


LVA KMAS S AS ARTPAGKR V INQEELRRLMKEKORLSTSR KR I ES 
PFAKYNRLGQLSCALCNTPVKSELLWQTHV1/?K0HREKVAELXG 
7sK EASOGS S ASSAPQS VKR KAPDADDODVKRAKATLVPOVQPS T 
SAWTTNFDKIGKEFIRATPSKPSGLSLLPDYEPEEEEEEEEEGD 
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
APIIPHSGSIEKAEIHEKWERRENTAEALPEGFFDDPEVDARV 
RKVDAPKDOMDKEWDEFCKAMRQVNTISEAIVAEEDEEGRLDRQ 
J 6EIDE0I ECYRRVEKLRNRQDEI KNKLKEILTl KELOKKEEEN 
AP S DDEG E LQDLLSQDWRVKG ALL 


6292 


1835 


1142 


TCPGAMKMVAPViTRF YSNSCCLCCHVRTGTI LLGVWYL1 1 NAW 
LLI LLSAliADPDQYWFSSSELG<5DFEFMPDANI4CIAI AI SLLMI 
LI CAMATYGAYKQRAAWI J PFFCYQ1 FPFALNMLVAI TVL1 Y PN 
SI0EY1R0LPPNPPYRDDVNSVNPTCLVL1 I LLF1 S I 1 LTFKGY 
LI SCVVmCTRYINGRNSSDVLVYVTSNPTTVLLPPYPDATVNGA 
AKEPPPPYVSA 


6293 


2382


1035


FWCTLGTVPVHPIGWCAlNSKILVPPRTIHAKFTPWKGYLMKRb 
VG S RTLPVPFH 1 KM V ESMKYPFRQGMRLE WPKSOVS RTRMAVV 
PTVIGGRLRLLYEDGPSPDPFWCHMWSPLIHPVGWSRRVGHGIK 
MSERRSDMAKHPTrRKlyCDAVPVLFKKVRAVYTEGGWFEEGKK 
LEAlDPLNLGNKVATVCKVLLDGYLMl CVDGGPSTDGLDWFCY 
}1ASSHA3 FPATFC0KND3 ELTPPKGYEAQTFNKENYLEKTKSKA 
APSRLFNMDCPNHGFKVGMKLEAVDLKEPRL 1 CVATVKRWHRL 
LSIHFDGWDSEYD0WVDCESPD1YPVGWCELTGYQLQPPVAAEP 
ATPLKAKEATKKKKKQFGKKRK^IPPTKTRFLROGSKKPLLEDD 
PQGARKISSEPVPGEI1AVRVKEEHLBVASPDKASSPELPVSVE 
NIKQETDD 




354 


1814 


A01.TTRGRT VAGG V K Wl P S P F P DLELY S CCLGTDRCFPELSKHC 
KI^V I ATAS DYDMAE I TNI R PS FDVS P WAGLI GASVL WCVSVT 
VFWJSCCH00AEKKHKNPFYKF1 HMLKGlSI YPETLSNKKKI 1 K 
VRRDKDGPGREGGRRNLLrVDAAEAGLLSRDKDFRGPSSGSClDO 
LP1XMDYGEELRSP1TSLTPGESKTTSPSSP2EDVMLGSLTFSV 
DYNFPKKALWT10F.AHGLPVMDDQT0GSDPYIKMTILPDKRHR 
VKTRVLRKTLDPVFDETFTFYGlPYSOLODbVLHFLVLSFDRFS 
RDDVIGEVMVPbAGVDPS'lGKVOLTRDl I KRNIQKCISR'GELQV 
S LS YQPVAORMT WL KAR H LQ KMD I AG LSGN P YVKVNVYYGR K 
R 1 AKKKTHVKKCTLNPI FNESFI YDI PTDLLPD3 S1EFLVIDFD 
R TTKNE WGRLI LG AHS VTASGA EH WR EV CES ?R K PVA KWHSLS 
EY 


6295 




617 


VSSALLTGATSGSEAAKS ESASAS PLSCTNAVAMDRPDEGPPAK 
TRRLSSSESPQRDPPPPPPPPPLLRLPLPPPOQRPRLQEETEAA 
OVIiADMRGVGLG P ALPP P PFYV1 LEECSG I RAY FTLGAJECPGWDS 
T1ESGYGEAPPPTESLEALPTPEASGGSLEIDFQWCSSSFGGE 
G ALETCSAVGWAPQRLVD? KS KEEAI 1 1 VEDEDEDERESMRSSR 
RRRRRRRRKQRKV Kit ES RERWAER MES 3 LQALED3 QLDLEAVN3 
KAGXAFLRLKRKF1 OMRRPFLERH DM 1 UH3 PGFWVXAFLNHPR 
1 S 1 L2 NRRDED1 FRYLTNLQVCDLRH 1SMGYKMKLYFQTNPYFT 
NMV 1 VKEFQRNRSCRLVSH STP 3 RWKRGOEPOARRHGNODASHS 
FFSWFSNHSLPEADR I AE1 3 KNDLWVNPLRYYLR ERGSRI KRKK 
0EMKKRKTRGRCEW1MEDAPDYYAVEDIFSEISDIDET1HDIK 
IvSDFMETTDYFETTPNEITDI NEN3 CDSENPDHNEVPNNETTDN 
NES ADDHErTDNNESADDNHEN F ECNNKNTDDNEENPNIWENTY 
GNNFFKGGFWGSHGNNCDSSDSDNEADEASDDEDNDGNEGDNEG 
SDDDGNEGDNEGSDDDDRD1 EYYEKVI EDFDKDOADYEDVIE3 1 
SDESVEEEGIEEG30QDEDI YEEGNYEEEGSEDVWEEGEDSDDS 
DLEDVLQVPNGWAN PGKRG KTG 


6296


72? 


1199 


RHCGCDAQGACDSLPFTGTSSPVTARNA1PEARCCVWLLDGTTV 
EAVRPARERLARKELROKRMQOFSRDSAYSSNKDSTCLLTERDT 
LGTSLQFPSPFSGT1 SFGSFSDSGI FPLGSQCCLGFQQFSISGX 
KWALIHKRVRLSVFGARWGRI YFGK 


6297 


1 


92; 


QRAAAAS PSSCGPRG AEYGALMAMEGY WRFLALLGS ALLVGFLS 
VI FALVWVLHYRBGLGWDGSALEFNWHPVLMVTGFVFICX5IAI 3 
VYRLPWTWKCSKLLMKSIHAGLNAVAAI LAI 3 SWAVFENHNVN 
»3AWYSLHSWVGl/2AV3CYLbOLl>SGFSVFLLPWAPLSLRAFl J 
MP1 HVYSGI VI FGTV I ATALMGLTEKLI FSLRDP AYSTFPPEGV 
FVNTLGLLI LVFGAL 3 FW3 VTR PgWKRP KEPNST3 LHPNGGTEQ 
GARGSMPAYSGWNMDKSDSELKNEVAARKRKLjALDEAGQRSTM 


6298 


3 


9Bb 


SVPLRRLSI^GTLQGAGTTTKMAVARI^AVAAWVPCRSWGWAAV 
P FG PHRGLS VLLAR I PQRA P R WLPACRQ KTS LS FLNR PDLPNLA 
YK1CLKGKSPGIIF1FG YLSYMNGTKALA.t EEFCKSLGHACIRFD 
YSGVGSSDGNSEESTLGKWRKDVLS3 3DDLADGPC3 LVGSSLGG 
WLMLHAAIARPEKWALIGVATAADaXVTKroOLPVBLKKBVEM 
KG VWSMPS KY SEEGV YNVQY S FI KEAEHHCLLH S P 1 PVNCPIRL 
LHGMKDDI VPWHTSMQVADR VXSTDVDV 3 LRKHSDHRMREKAD3 
OLLVYT3DDLIDIOST 3 VN 


6299


Si 1 




rrni tr-/~>"T r> vn / <r» T c T ci r4T»W»C f)T /"itit T t ru Ti/^tf tct ricx t 1 «r»o c»~ 

FCDhEGIKPNV i J.bi-»SLP i NOi VLAjlJl LVHPt. VI SLDSAI JLTSS 
S 3 DAMDESAFSGPyKFPFTPPLESFNLCFYTSOVPVPPI LGFYQ 
MKF.EEVQLRNNH 


6300


12 J 


69, 


AA PS CWSCRGVPAAGT PS S PR LLVSKAAAPS AG ?WGAWR<?GARA 
AOSPPSTPNSSSVPYGSODSVHSSPSDGGGGKDRPVGGSPGGPR 
LVlGSLPAKLSPHMFGGFKCPVCSXFVSSDEMDbHLVMCbTKPR 
J TYNEDVLS KDAGECA3 CLEELQOGPTI AKLPCLCI YHXGC1 DE 
WFLVNRSCPEHPSD 


6301 


61( 


264 


GKKVPVNWEPPQPLFFPKYLRCYRCLLETKELGCLU5SDICLTP
AGS SC I TLH KKNSSGSDVMVSDCR S KSQMS DCS NTRTS PVSGFW 
J FSQYCFLDFCtfDPQNRGLYTP 


6302 






J FGFLHLFHMEHSFLLVCALFAHVFFSSSCGSSVALHSDPCLLS 
PVLl^CLPGDLRPLDELY/iOKLKYKAlSEELDHALNDMTSL 


6303 






yV^EYGGGLLWOSWQEXHPGOALSSEPWNFPDTKEEWEOHYSOL 
YKYYLEQFCYWEAQGWTFDASQSCDTDTYTSKTEADDKNDEKCM 

KVDbVSFLSSPIMGDNDSSGTSDKBHSEJLDGISNIKLNSEEVT 
OSOLDSCTSHDGHQOLSEVSSKRECPASGOSEPRNGGTNEESNS 
SGKTWTDPPAEDSOKSSGANTSKDRPHASGTDGDESEEDPPEHK 
PSKLKRSHELDIDENPASDFDDSGSLLGFKYGSG0KYGG3PNFS 
HWVRYIiEKNVKLKSKYLDMRROIKMKNKHIFFTKESEKPFFKK 
SKUiSKVEKFLTWVNKPKDEEASOESSSHDNOlDASTSCDSEEQ 
DMiVKKGDDLLETNNPEPEKCOSVSSAGEI>ETENYERDSI>bATV 
PDEODCVTOEVPDSRQAETEAEVKKKKKKKXNKKVNGLPPEIAA 
VPEbAKYWAORYRLFSRFDDGlKLDREGWFSVTPEKIAEHlAGR 
VSOS FKCDVWDAFCGVGGNTIQFALTGMRV1 AI DI DPVK3 ALA 
RNNAEVYGIADK1EFICGDFLLLASFI.KADWFLSPPW3GPDYA 
TAET FDI RTMMS PDGFE3 FRLS KK1TNN I VY F LPRN AD3 DOV AS 
LAGPGGQVEIEONFLNNKLKTITAYFGDLIRRPASET 


6304 
~ 6 3 OT 


] 


143fc 


H RAR VDR S RES PGGDLRH PGRVRRD I TLSGHF R LSTQHVVLLR E 
DEVCDPGTKDLGHPQHGSP 1 QETQSEWTLVS PLPGSDttAALPA 
WRATSGLTLWPHTAEGRDLLGAENRALTGGQOAEDPTLASGAYQ 
K PGS VEKLQGS VWCDAETLLS S SRTGGQ A P P WLTDHDVQW >RLL 
ADGEVVDKAR V PAHGOVLOVG r STEAALODLS S P RLSQLCSOGL 
CGI, I KRPGDLPEVLSFHVDRVLGIJRRSLPAVARRFHSPLLPYRY 
TDCGAR P V I WWAPDVQHLSD PDSDQNS 1ALGWLQYQALLAH s CN 
WPGOAPCPGIHHTEWARLAXFDFLIOVHDRLDRYCCGFEPEPSD 
PCVEERLREKCRHPAELRLVHILVRSSDPSHLVY1DKAGNLQHP 
EDKlJ^FRLLEGIDGF?ESAVKVLASGCLONMbLKSLOMDPVFWE 
SOGC-AOGLKOVLQTLEQRGOVLLGKIOKHNLTLFKDEDP 


By 


420 


m 3 WRGRSTYRPRPRRSVPPPELIGPMLEPGDEEPOQEEPPTES 
RDPAFGQEREEIX>GAA£TCVPDLEADLOEI>SOSKTGDECGDGPD 
VQGKl LTKSEQFKMPEGR 


6306 


1 


1874 


PTR PS KVKVPHTFL I HS YT RPTVCQACK KXL.KGLFRQGLOCKDC 

vcMruvrj^BTDifONnTT otr 75 t TMr^nx/DMrtraTDFCP ttDlC^ZiTAin 
Kr rvurl fVKL>VI KVKWU^i^c-AUjJ.lv^iJVi'rit.iV/Vi lyrocMi/ivoMJjJH/ 

ESEDSGVI PGSHSENALHASEEEEGEGGKAOSSLGYI PLMRWQ 

SVRHTTRKSSTTLREGWVWYSNKI»TLRKRHYWRLDCKCITLFO 

NNTTNRYYKEIPLSEILTVESAONFSLVPPGTNPHCFEIVTANA 

TY FVGEMPGGTPGGPSGQGAEAARGWETAI RQALMPV I LQDAPS 

APGKAPHRQASLS ISVSNSQ3 QBNVDIATVYQI FFDEVLGS3QF 

G WY GG KHR KTGRDVAVKV I DKLR ? PTKQE SQLRNE VA I LQSLR 

HPGIVNLECMrETPEKVFWMEKLHGDMLEMILSSEKGRLPERL 

TKFLI TQ1 LVALRHLHFKNI VHCDLKPENVLLASADPFPQVKX.C 

DFG FAR 1 1 G EKS FRRS WGTPAYLAPEVLLNQG YNRSLPMWS VG 

V I KATSLSGTFPFNEDED1 NDQ10NAAFMYPASPWSH1 SAGAI D 

LIIWLLQVKMRKRYSVDKSLSHPWLQEYQTWLDLRELEGKMGER 

YITKESDDARWE0FAAEHPLPGSGLPTDRDLGGACPPQPHDMOG 
LAERISVL 


6307 


2136 


58 S 


CFLLPKGRDPEPPEAGAAAPCAPGAPDMSFRKVVRCSKFRKVFG 
0PVKND0CYEDIRVSRVTWDSTFCAVNPKFLAV1VEASGGGAFL 
VLPLSKTGRIDKAYPTVCGHTGPVLD3DWCPHNDEVIASGSEDC 
TVMVWQ1 PErJGLTSPLTEPVWLEGHTKRVGI I AWHPTARNVLL 
SAGCDNWLZ1WVGTAEELY1RLDSLHPDL1YNVSWNHNGSLFCS 
ACKDKS VRI IDPRRGTLVAEREKAliEGARPMRAI FLADGKVFTT 
GFSRMSEROLAL'WDPENLEEPMAbOELDSSNGALLFFYDPDTSV 
VYVCGKGDSSIRYFEJTEEPPYJHFLNTFTSKEPQRGMGSMPKR 
GLEVSKCEIARFYKLKERKCEFJVMTVPRKSDLFQDDLYPDTAG 
PFAALSAEEVTVSGREADPI LI SLREAYVPSKQRDLKXSRRNVLS 
DSR PAMAPG S SHLGAP ASTTTAADATPSGSLARAGEAG K LEEVM 
OELRALRAL VKECGDR I CR LEEQLGR MENGDA 


6308 


2 


1116 


GRPTRPEKMLLSLVLHTYSMRYLLPSWLLGTAPTYVLAWGVWR 
LLSAFLPARFYQALDDRLYCVYQSMVLFFFENYTGVQILLYGDL 
PKNKENI J YLAJTOQSTVDWI VADI LAI RQNAU3HVRYVLKEGLK 
WLPLYGWYFAQHGG1YVKRSAKFNEKEMRNKL0SYVDAGTPMYL 
VIFPEGTRYNPEO'rKVLSASOAPAAORGLAVLKHVLTPR I KATH 
VAFDCMKNYLDAIYDVTWYEGKDDGGORRESPTMTEFLCKECP 
K1H1HIDRIDKKDVPEE0EHMRRWLHERFEIKDKMLIEFYESPD 
PERRKRFPGKSVNSKLS 1 KKTLPSMLILSGLTAGMjMTDAGRKL 
YVNTWI YGTLLGCLWVT1 KA 


6309


220 


S63 


lvaevkepcslpmlsvdmenkengsvgvkksmengrppdpadwa 
vmdwnyfrtvgfee0asafoe0eidgkslllmtrkdvltglql 
klgpalkiyeyhvkplctkhlknnss 


6310 


36 


973 


GPRCWKFLILSSVNCETLRlGKAWPOSSGQERYViTPRTHSSA.£=; 
AQRGSLA3LNVAAAGLWADCDQPLYDCPMCGL1 CTWYHI LQEHV 
DLHLEENSFQCJGMDRV0CSGDL0LAJIQLQQEEDRKRRSEESROE 
IEEFQKLQRQYGLDNSGGYKQOOLRNMEIEVNRGRMPPSEFHRR 
KABMMESIALGFDDGKTKTSGIIEALHRYYOWAATDVRRVWLSS 
WDH FH $ S LGDXG WG CGY RN FQMLLS S L LQNDAY NDCI »KGM L I P 
CI PKIOSMIEDAWKEGFDPOGAS0LI IRLQGTKAWIGACEVYIL 
LTSLRV 


6311 


1 


67b 


PVWWNSCEGPRU^AAARTGHGVGRRARIACLGEPRVKAAVKLTL 
ASKLKRPDGLKGSRTAATASDSTRRVSVREiKLLVKEVAELEANL 
PCTCKVliFPDPNiaHCFOL'JVTPDEGYYQGGKPOFETEVPDAYW 
M V? PKV K CLTK I WH PN I TETG E I CLS LLR EHS I DGTGWA PTRTL 
KDWWGLNSLFTDLLNFDDPLN I EAAEHHLRDKEDFRNK VDDY I 
XRYAR 


6312 


213 


1400 


GDELVKREAGMKWI.PGVGVFGTGSSARVLVPLLRABGFTVEALW 
GKTEEEAKQLAEEMN I AFYTS RTDD I LLHQDVDLVCIS I PPPLT 
RQI S VKALG I GKKWCEKAATSVDAFRMVTASRYY PQLMSLVGN 
VLRFLPAFVRMKQLI SEHYVGAVMI CDARIYSGSLLSPSYGWI C 
DELMGGGGLHTMGTY 1 VDLLTHLTGRKAEKVHGLLKTFVRQNAA 
1RGIRHVTSDDFCFF0KLMGGGVCSTVTLNFNMPGAFVHETVMVV 
GSAGRLVARGADLYGQX^SATOEELLLRDSLAVGAGLPEOGPQD 
VPLLYLKGMVYMVOALROSFOGC<5DRRTWDRTPVSf4AASFEDGL 
YMQSWDAI KRSSRSGEWEAVEVLTEEPDTNQNLCEALQRNm. 


6313 


2 


2071 


CRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL 
FHRJiRAHGCTLSCSSFVEOPTAMEAEETMECLQEFPEKHKMILD 
RLNEQREODRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF 
TOEPLVEIEGVSKMAFRHLI EFTYTAKLMIQGEEEANDVWKAAE 
FLQMLEAIKALEVRNKENSAPLEENTI'GKNBAKKRKIAETSNVI 
TESLPSAESEPVEIEVEIAEGTIEVXDEGIETLEEVASAKQSVK 
Y IQSTGS SDDSALALLAD I TSKYPOGDRXGQI KEDGCPSDPTSK 
OVEG I E I VELQLSHVKDLFHCE KCHRSFKLPYUFKEHMKSHSTE 
SFKCEICNKRYLRESAViKQHLNCYHLEEGGVSKKQRTGKK.lHVC 
QyCEKOFDHFGHFKEHLRKHTGEKPFECPNCHfRFARNSTLKCK 
LTACOTGVG AK KGRKKl/y ECQVCNSVFNSWDQFKDHbVI HTGDK 
PNHCTLCDLWFMCX5NELRRHLSDAHNISERLVTEEVLSVETRV0 
TEPVTSMT1 1 EQVGKVHVLPLLOVQVPSA0VTVEQVHPDLLODS 
QVHDSHMSE LP EQVQVS YLSVGRIQTEEGTE VHVEELHVERVNQ 
M P V E VQTE b L E AD LDf J VT P 2 1 MNQE E RES SO ADAAE AAR E DH ED 
AEDL.ETKPTVDSEAEXAENEDRTALPVLE 


6314 


2 


2C71 


ORSGAARLAhLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL 
FHRRRAHGCT1.SCSSFVE0PTAKEAEETMECLQEFPEHHKMILD 
RLNEQRECDRFTDlTLJVDGHHFKA>IKAVLAACSKFFYKFFOEF 
TO E P b V E 1 EG VS KMAFR K L I E FT YTAKbM I OG E E EAND VW KAAE 
FLQMbEAl KALEVRNKENS APLEENTTGKNEAKKRKI AETSNVI 
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKOSVK 
YIOSTGSSDDSALALLADI TSKVR0GDRKG0I KEDGCPSDPTSK 
OVEGIEIVELObSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE 
SFKCEICNKRYLRESAWK0HI>NCYHLEEGGVSKKQRTGKK1HVC 
OYCEKQFDKFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTAC0TGVGAKKGRKKLYEC0VCNSVFNSWDOFICDHLV1 HTGDK 
PNHCTLCDLWFMOCNELRRHLSDAlINI SERLVTEEVLSVETRVQ 
TEPVTSMT3 J EOVGKVHVLPLM)VQVDSAQ\nTVEQVHPDbLQDS 
QVHDSHMSELPEQVQVSYLEVGRJQTEEGTEVHVEEbHVERVNQ 
MPVEV0TELLEADLDHVTPEIMNOEERESSOADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6315 


1 


1011 


IjGUAVKWTTLVLI sycptateeapywtyllcalglfxyoslda 

IDGXQARRTJ\\S CSPLGELFDHGCDSLSTVFMAVGASIAARLGTY 
PDWFFSCSFJGMFVFYCAHWOTYVSGMLRFGKVDVTEIOIALVI 
VFVI.SAFGGATHWDYT1P1LEIKLK1LPVLGFLGGVIFSCSNYF 
HVII.HGGVGKNGSTIAGTSVLSPGLHIGLII1LA1M1YKKSATD i 
VFEKHPCLYI LMFGCVFAKVSOKLWAHMTKSELYLQDTVFLGP j 
GLLFLDOYFNNF1DEYWLWMAMVISSFDMVI YFSAbCbQlSRH 
IiHbNI FKTACHOAPEQVQVLSSKSHONMMD 


6316 


1503 


79> 


VSAGAGTGI f-JGGTTSTRR VTKEADENEN ITWKG 1 RbSENV I DR 

mkesspsgsksqfysgaygasvsdeelkrrvaeelal5qakkes 
edqk r lkqak fldreraaan eoltra i lrer i cs eeerakakhl 
aroleekdrvlkkodafykeqlarleerssefyrvtteqygkaa 
eeveakfkryfshpvcadLoakilocyrenthotlkcsalatqy 

MHCVNHAKQSKbEKGG 


6317 


102 


839 


peaqtsav1arekghlptmrheapwma5a0daryg0kdssdon 
fdymfklliignssvgktsflfryaddsftsafvstvgidfkvk 
tvfknekrik1.0iwdtag0eryrtittayyrgamgf1lmyd1tn 
eesfnavqdwstoiktyswdnaovilvgnkcdmedervisterg 
qrlgeqlgfeffetsakdninvkotferlvdi i cdkmsesletd 

PAITAAKQNTRLKETPPPPOPNCAC 


6318 


1765 


733 


PWHPLRTLPbHHPHPKPPRAEGREGADSMSHbPGLELRREAPPb 
LGPLLSPFPLPAGSWHROMLRSSLRFPJTNSAGAPCKAAGRMNI 
LAPVRRDRVLAELPQCLRKEAALHGHKDFHPRVTCACOEHRTGT 
VGFKISKV1WGDLSVGKTCL1NRFCKDTFDKNYKATIGVDFEM 
ERFEVLGI PFSLQLWDTAC-QERFKCIASTYYRGAQA1 1 I VFNLN 
DVASLEHTKOWLADAbKENDPSSVLLFbVGSKKDLSTPAOYALM 
EKDALOVAOEKKAE YWAVSSLTG ENVR EFFFR VAALTFEANVLA 
ELEKSGARR I GDWR I NSDDSNbYLTASKKKPTCCP 


6319 


88 


717 


AATWRLNQNTbLbGKXVVbVPYTSEHVPSRYHEWMKSEELQRLT 
AS EPLTLEOEYAMOCSHQEDADKCTFI VLDAEKWQAQ PGATEES 
CMVGDVNLFLTDbEDLTLGE 1 EVM I AEPSCRGKGbGTEAVLAML 
SYGVTTLGLTKFEAXIGOGNEPSIRMFOKLHFEQVATSSVFOEV 
TLRLIVSESEHQWLLEOTSHVEEKPYRDGSAEPC 


6320 


90 


1111 


RPRTGREKVAK.AAVDSFYLLYREIARSCNCYMEALALVGAWYTA 
RKSITVICPFYSL3PLHFIPRLGSRADL1KQYGRWAWSGATDG 
JGKAYAEELASRGLNI ILISRNEEKLQWAKDIADTYKVETDI 1 
VADFSSGRE I YLP I R EALKDKDVGI LVNNVG VFYPYPO Y FTQLS 
EDKLWD1 1NVNI AAASLMVWVLPGMVERKKGAI VTI SSGSCCK 
PTPOLAAFSASKAYLDHFSRAIiOYEYASKGl FVQSL1 PFYVATS 
MTAPSNFLHRCSWLVPSPKVYAHHAVSTLGISKR1TGYWSHSIQ 
FLFAOYWPEWLWVWGANI LNRSLRKEALSCTA 


6321 


3.41B 


34'. 


HR KAA1X- AbMAGRLLG KALAAVS LS LALASVT I RSS RCRG 1 OAF 
RNSFSSSWFHLNTKVMSGSNGSKENSHNKARTSPYPGSKVERSQ 
VPNEKVGWLVEWQDYKPVEYTAVSVIiAGPRWADPQISESNFSPK 
FNEKDGHVERKSKNGLYEIENGRPRNPAGRTGLVGRGLLGRWGP 

I PGGMVDPGEKI SATLKREFGEEALNSLQKTSAEKRE IEEKLHK 
LFSQDHLVI Y KGYVDDPRNTDNAWMETEAVNYHDETGE 1 MDNLM 
LEAGDDAGK V K WVD 3 N DKLKL YASH SQFI KLVAEKRDAH WS EDS 
EADCHAI 




2 (M 7 


i n o "j 


WS I NCCDDGEGSOOEE V I SSED J GAS I FNGQXKVL Y YADA LTE I 
AFWPSPVESLTDSLESNlSDODSDSNMDLMPGILKOPSbTLEL 
FPNHTDNLNSSORLSPSSRMRKLPOGRPVPPI^GPETRVSWWVE 
RYDDI ENFPX.SELMTE3 STGVETTANSSTSLRSTTLEKEVPVI F 
IHPLNTGLFR I K30GATGKFNMVI PLVDGMI VSRRALGFLVROT 
V1N1CRRKRLESDSYSPPHVRRKQK1TDIVNKYRNK0LEPEFYT 
SLFQEVGLKNCSS 


6323 


2 


65 f 


PA S TTDG AQE AR V P LDG A F W I PR P PAGS PKG CFAC VS K P P ALQA 
PAAPAPEPS AS PPMAPTLFPMESKSS KTDSVRAAGAPFACKHLA 
EKXTMTN PTTV 1 E VY PDTTEVNDY Y IMS I FN FVY I ,N FCCLG F I A 
LAYS LK VR DK Kl XNDLNGA VEDAXTDR LI NITRS GLiAAS C I MLW 
MALSVIATHRGLRSSASILVAEPHDWNTERPQVTFRERCPAL 


6324 


1 


2061 


EGAGMRRCPCRGSLMEAEAGALPAAARMGLEAPRGGRRROPGOO 
RPGPGAGAPAGRFEGCCPWARTEGSSLHSEPERAGLGPAPGTES 
PQAEFWTDGQTEPAAAGLGVSTERPKQKTEPDRSSLRTHLEWSW 
SELGTTCLWTETGTDGLWTDPHRSDLQFQPEEASPWTQPGVHGP 
WTELETHGSOTOPERVKSWADNLWTHONSSSLQTHPEGACPSKE 
PSADGSWKEL YTDGSRTOODI EGPWTEPYTDGSQKKODTEAAR K 
QPGTGGFOIOQDTDGSWTQPSTDGSOTAPGTDCLLGEPEDGPLE 
EPEPGELLTKLYSHLKCSPLCPVPRLIITPETPEPEAOPVGPPS 
RVEGGSGGFSSASSFDESEDDWAGGGGASDPEDRSGSKPWKKL 
KTVLKYS P FVV S FR KHYPKVQLSGHAGHFQAGEDGR I bKRFCQC 
EORSLEQLMKDPLRPFVPAYYGMVLODGQTFNQKEDLLADFEGP 
SIMDCKi1GSRTYLEEELVKARERPRPRK3>lYEKMVAVDPGAPTP 
EEHAOGAVTKPRYMOWRETMSSTSTLGFRIEGI KXADGTCNTNF 
KKT0ALEQVTKVLEDFVDGDHV1LQKYVACLEELREALE1SPFF 
KTHEWGSS LL FVHDHTGLAXVWM I DFGKTVALPDHQTLSHR LP 
WAEGNREDGYLWGLDNMICLLQGLAOS 


6325 


165


94< 


GLRDPFRRKRRLKPOVKMSNYVNDMWPGSPOEKPSPSTSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHO 
RKYRRYSRSYSRSRSRSRSRRYRERRYGFTRRYYRSPSRYRSRS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRS RTP FRLSEKDRMELLE I AKTNAAKALGTTN I DLPASLRT 
VPSAKETSRG1GVSBNGAKPEVSILGLSEQNFQKANCQI 


6326 


238 


680 


GEPS PATQOKPS ATGAGVLHQHFSSGH I YVLMGLLP P ?WT 1 S FT 
VOTTL0PPGGLPAAPVSGRMAFEPVGRDLARRMVPRAGKRT0TL 
GARRVAAOGARPLPEDRRPKSGERLHVTVAPCWEFVLPSVSLTA 
QAWGGVGOEASSGV* 


6327 




1337 


SLARIAPAGGSVVMPTQQPAAPSTRAPKPSRSLSGSbCMJFSDA 
DSGSGMKAELPPGPGAVGREMTXEEKLQLRKEKKQOKKKRKEEK 
GAEPETGSAVSAAQCOGPTRELPESGlQliGTPREKVPAGRSKAE 
LRAERRAKOEAEKALKOARKGEQGGPPPKASPSTAGETPSGVKR 
LPEYPOVT>r>T.i ,i.RB'LVXKPFRr>n\7PTR'KnYr;«;KV«;i.v^uT povq 

ur t» i * s/ * is ui-ii->LJ i\f- u v fs, i\ tr E*r\\^\^ v r i *\ i\u i uoi\ v o L>r url ury J o 

R QNSLTQFMS I F S S V I HPAMVR LGLOVSOGLVRGSNAR CI ALLR 
AUX)V 1 ODYTTPPNEELSRDLVNXUKPYMS F1,TQCRPLS ASMHN 
AIKFLNKKITSVGSSKREEEAXSELRAAIDRYVQEKIVLAAOAI 
S R FAYOK I SNGDV I LVYGCS SLVSRI LQEAWTEGRRFR'sTWVDS 
RPWLEGRHTLRSLVHAGVPASYLLIPAASyVLPEVSTESKDSKV 
GGEKV 


6328 


1030 


276 


}1AS AE VTTAAARGU3 AMEEEMHTDAK 3 RAENGTGS S PRG PGCS 1* 
RHFACEQNLLSRPDGSASFLQGDTSVLAGVYGPAEVKVSKE1FN 
KATLEVI LRPKIGbPGVAEKSRERLIRNTCEAWLGTLH PRTSI 
T W LQ W S DAGS L1A C CIjN AAC MALVDAGV PMRALFCG V AC ALD 
S DGTLVLDPTS KOE KE ARAVLT FALDS VERXLLMS ST KG LYSDT 
EL0OCLAAAOAAS0HVFRFYRESLORRYSKS 


6329 


3 


2016 


SSEVAAGGGTRSAMAEGSGEWTVSATGAANGbNNGAGGTSArr 
SN PLSRKLHK I LETRlrDNDKEMLEALKALSTFFVENS LRTRRN1> 
RGDI ERKS LAINEEFVS I FKEVKEELES 3 SEDVQAMSNCCQDMT 
SRbOAAKEQTODLI VKTTK1X)SESQKL-E I RAQVADAFLS KFQLT 
SDEMSLLRGTRKG V 1 TEDFFKALGRVKQ3 HNDVKVLLRT^OOTA 
GLEI MEOMALLQETAY ERLVR WAQSECRTLTQESCDVS P VLTOA 
MEALODR PVLY KYTLDE FGTARRSTWRGF 3 DALTRGG PGGTPR 
PlEMHSHDPLRYA/GDMl^WLHQATASEKEHLEALLKKVTTQGVE 
EN J 0EWGH I TEGVCR PLKVR J EQVI VAEPGA VLL YK I SNLLKF 
YBHT I SG I VGNS AT ALLTT I EEMHIibSKKI F FNS LSIiHAS KLMD 
KVELPP PDI^PSSA1jKC)TIjMLL.R EVl^SHDSSVVPLIJAPQADrV 
OVLSCVLDPLbQMCTVSASNl^STADMATFMVNSLYMMKTTLALF 
EFTDRRLEMLQFQI EAHI.DTL3 NEQASYVLTRVGLSYI YNTVOQ 
H KPECCS LANM PNLDS VTLKAAMVQFDR YLSAPDNLL 3 PQLNFL 
LSATVKEQ 3 VKQSTELVCRAYGEVYAAVMNP3 NEYKDPEN I LHR 
SPGQVQTLLS 


6330 


1151 


333 


FFYYTFYENKTrSRKMVAEKETLSliNKCPDKMPKRTKLLAOOPL 
PVHQFHSLVSEGFTVKAMMKNSWRGPPAAGAFKERPTKPTAFR 
KFYERGDFP1ALEHDSKGNK3AWKVEIEKXDYHHYLPLFFDGLC 
EMTFPYEFFAROGIHDMLEHGGNKILPVLPOLIIPIKNALNLRK 
RQVICVTLKVI^HLWSAEMVGKALVPYYROILPVLNIFKNMNV 
NSGDGI DYSQ0XREN3 GDLI QETLEAFER YGGENAP3 N3 KYWP 
TYESCLLN 


6331 


3 




QQGQR VR TRG R RA CAS ATPLEGCVDLS YPRTHAAJjLKVAQMVTL 
L1AFICVRSSLWTNYSAYSYFEWTICDLIMILAFYLVHLFRFY 
RVLTCISWPLSELLHYLIGTLLLLIASIVAASKSYNCSGLVAGA 
I FGFMATFLCMAS1 WLSYKISCVTOSTDAAV 


6332 


1 


878 


.VTESNKFDLVSF1PLLRERIYSNNQYAR0FI1SWILVLESVPDI 
NLLDYLPE 2 LDGLFQI1XJDNGKJS I RKMC2WLGEFLKE I KKNPS 
SVXFAEMANlbVIHCQTTDDLIOLTAMCWMREFIQLAGRVMLPY 
SSGILTAVLPCLAYDDRKKSIKEVANVCNQSLMKLVTPEDDELD 
ELRPGOR0AEPTPDDALPXOEGTASGEWTPSLHLTSCRGPREPD 
V3GVALGPHLSNQDYFMYVTHTIVAAT0RSGSSGSPPFCRODTG 
KLSTMATHSOLVXTGTGLEPRQAVSSSH 


6333 


3 


1467 


TRTPSEAEAGGES POSCVSAAHSDWTAGKPVSLIAPLI PPRSAG 
QPLTFSPSGRQPLRSIiLVGMCSGSGRRRSSLSPTMRPGTGAERG 
GLMMGH PGMH YAPMGMH PMGQRAKMPP VPHGMMPQMMPPNGGPP 
KGQMPGMMSSVMPGMMMSHT^SQASMQPAIJPPGVNSMDVAAGTAS 
OAKbntN x bHKb FLXjhl Y I jN lEi KQSThEK PDDLKT P AEQLuo K 
CPWKayKSDSGXPYYYWSOTKESRWAKFKELEDLEGYONTlVAG 
SLITKSNLHAMI KAEESSKQEECTTTSTAPVPTTKl PTTMSTMA 
AAE AAAA VVAAAWJXAAAAAANANASTSASNTVSGT V PWPE P 
bv 1 5> 1 vA J v VUWt.NI V r± SThc.(jAQLfl STPAl yDCJSVEVSSNTG 
E ETS KCETVAD FTPKKEEEESC PAK KTYT WNTK EEAKO A FKELli 
KEKRVPPNASWEQAMKMIINDFRYSALAKLSEKKOAFNAYKVQT 
EKK 


6334 


. n 


644 


GGNPSGRAAGFAA'\AMPSSPLRVAVVCSSN0NRSMEAKN1LSKR 
GFSVRSrcTGTHVKLPGPAPDKPNVYDFKTTYDQMYKDLLRKDK 
ELYTOK<;ii.HKbDRNKRIKPRPERFONCKDLFDLlLTCEERVYD 
QWEDLNS REOETCQPVHWNVDI QDN1IEEATLG AFL 1 CELCQC 
lOHTEDt/JENEIDELI^EFEEKSGRTFLHTVCFY 


6335 


8^ 


529 


AARAR FGVLCCRLLGAALGDCSRVEMS y 1 PGQPVTAWORVEIH 
KLROGEK^Ll LGFS1 GGG1DQDPSQN PFSEDKTDKG1 YVTRVSEG 
GPAE3AGL03GDKI>XWGWDMTMVTHDQARKR1jTKRSEEVVRL 
LVTROSLOKAVQQSMLS 


6336 


1003 


438


HEPASKGRAEVGNMRLSVAAAJSHGRVFRRMGLGPESRIHLLRN
LLTGLVRHERIEAPWARVDEMRGYAEKLIDYGKLGD7NERAMRM
ADPWLTEKDLIPKLPOVLAPRYXDQTGGYTRMLOIPNRSLDRAK
MAVIEYKGNCLPPLPLPRRDSHLTLLNOLLOGLRODLKOSQEAS
NHSSHTAQTPGJ


6337 


7< 


524 


EG10MLSV0PDTKPKGCAGCNRKIKDRYLLKALDKYWHEDCLKC 
ACCDCR • G EVGS TLYTKANLI LCRRD YI.RL FGVTGNCAACS KLI 
PAFEM VMRAKDNV YHLDCFACQLCNQR FCVGDKF FLKNNMI LCQ 
TDYEEGLMKEGYAPQVR 


6338 


6fc 


1349 


APNSESGTOGPLPTPANLFWTRRANPDPTTSMSATDRMGPKAVP 
GLR lAJb LliLLGLGTP KSGVQG 0 EGLD FP E YDG VD K V I NVJM A KN Y 
KNVFKKyEVLALLYHEPPEDDKASQROFEMEEIilLELAAOVLED 
KGVGFG LVDSEKDAA VAKKLGLTE VpSM YVFKGDEVI EY DGEFS 
ADTIVEFLLDVLEDPVELJEGERELQAFENIEDEIKL1GYFKSK 
DSEH Y KA FEQAAEEFHPYI PFFATFDSKGAKKLTLKLNE 3 DFYE 
AFKEEPVTIPDKPNSEEEJVNFVEEHRRSTLRKLKPESMYETWE 
DDMDG 1 H 1 VAFAEEADPDGFE FLETLKAVAQDNTENFDLS I J W I 
DPDDFPLLV/PYWEKTFDIDLSAPQIGVVWTDABRLWMEMDDEE 
DLPS AE E LEDWLEDVLEGE I N TEDDDDDDDD 


6339 


246 


3813 


NRCDRGGGGQAERQAGOGCRTOGAGPGFGFGHSFFSOGAMKAFH 
T FC W 1»LV FGS V SE AK FDDFEDEED3 V £ Y DDND F AEFED VMED S 
VTESPORVI ITEDDF.DETTVEIjEGQDENOEGDFEDADTQEGDTE 
SEPYDDFEFEGYEDKPDTSSSKNKDP1T1VDVPAHL0NSWESYY 
LEI LM VTGL.LAY 1 MN Y I 1 GXNKNi»KlAAyAW Yj* i liK h.LiL»t.S?J1 r 1 L» 
VGDIX^TNKEATSTGXLNQENEHIYmWCSGRVCCEGMblO^RFT. 
KR QDLLNV LARMMR PVSDQVQ 1 KVTMNDEDMDT YV FAV GTRXAL 
VR LQK E WQDIjSE FCSDKPKJSG A K YGLPDS bA3 LS EMG EVTDGMH 
DTKMVHFLTHYADKI ESVHFSDQFSGPKI MQEEGOPLXLPDTKR 
TLLLTFJWPGSGNTYPK DMEALLPLWNM V I YS I DKAKK F*RLnRE 
G KQKADKNRARVEENFLKLTKV0ROEAA0SRREE K KRAEKEP IM 
NEEDPEKORRLEEAALRREQKKLEKKOMKMXQIKVKAK


6340 




583 


EACA«TLSCPAFARU5RJ^RRPWMSHRTSSTFRAER$FHSSSSS 
SSSSTSSShS RALPAQDPPMEKALSMFSDDFGSFMRPHSEPLAF 
PARPGGAGNI KTLGDAYEFAVDVRDFSPEDI I VTTSNNHI EVRA 
EKLAADGTV>rWFAHKCQLPEDVDPTSVTSALREDGSLTIRARR 
HPHTEKVQQTFRTEIXI 


6341 




€-4 5 


KKAVXSAPGLRGFRILGLRSSVGPAVQARGVHOSVATDGPSSTQ 
PALPKARAVAPKPSSRGEYVVAKLDDLVNWARRGSLKPMTFGLA 
CCAVEKWIKAAPRYDMDRFGVVFRASPROSDVMIVAGTLTNKMA 
WO 01/53312



PCT/US00/34263



PALRKVYbQMPEPRYWSMGSCAWGGGYyHYSysWRGCPRIVP 
VD I Y I PGCPPTAEALLYG I bQLQRKI KKERRLQIWYRR 


6342 






DPRVRAMIATIARVAALRKTCLFSGSGGGRGLWTGRPQSDMN-NI 
KPLEGVKI LDLTRVLAGPFATMNLGDU5AEVI KVERPGAGDDTR 
TWG? P r VGTEST YY LS VNRNKKS 1 AVN I KDPKGVKI 1 KELAAVC 
DVFVENYVPGKLSAMGLGYED1BEIAPH1IYCS1TGYG0TGP1S 
ORAGYDAVASAVSGLMH I TGPEVACLSH1 AANYL2 GQXEAKRWG 
TWiGSlVPYQAFKTKDGYlWGAGlWOOFATVCKlLDLPELIDN 
SKYKTNHliRVHNRKELlKlI^ERFEEELTSKWLYLrFEGSGVPYG 
P INNMKNVFAEPOVUmGLVMEMEHPTVGKl SVPGPAVR YSKFK 
MS EAR PF PLLGQHTTH 1 MCEVLR Y DDRA1 GELLB AGWDQHETH 


6343 




33b 


GTAMVSDEDELNLLV1 VVDANP 1 VfWGKOALKESQFTl>S KCI DAV 
MVLGNSHLFMNRSNKLAVIASH10ESRFLYPGKNGRU5DFFGDP 
GNP PE FN PS GS KDG K Y E LLTS ANE V I VEE I KD LMTK S D I KGQHT 
ETLLAGSLAKALCY3HRWNKEVKDN0EMKSR1LVIKAAEDSALQ 
YMNFMNV I FAAQKQNT b I DAC VLDSDSGLLOQACDI TGGLYLKV 
POMPSLLQYLLWVFLPDOnORSOLILPPPVHVDYRAACFCHRNL 
IEIGYVCSVCLSIFCNFSPICTTCETAFKlSLPPVLKAKKKKZiK 
VSA 


6344




14 7 


TM PTA TLGNL RG Y G MA S PG LAAP S 1>T P PQLAT PNLQQFF PQATR 
OS LLG P P PVG VPMNPSQ FNLSGRN POKQARTSS STTPNR KDSSS 
QTMPVEDKSDPPEGSEEAAEPRWDTPEDODLPPCPEDIAKEKRT 
PAPEPEPCEASELPAKRI.RSSEEPTEKEPPGQLQVKAQPQARMT 
VPKQTOTPDLLPEALEAQVIjPRFCPRVLQVQAQVOSQTQPRIPS 
TDTQVQPKLOKQAQTQTSPEHLVLQQKQVQPQLQQSAEPQKOVQ 
PQVQPOAHSQGPRQVQLQOEAEPLKQVQPQVQPOAHSOPPRQVQ 
LOLQKQVQTQTYPQVHTQAQPSVQPQEHPPA0VSV0PPEQTHEQ 
PHTQPOV SLLAP EQT P WVHV CGLE M P P DAVEAGGGMEKTL PE P 
VGTQVSMEEIQNESACGLOVGECENRAREMPGWGAGGSLKVTI 
LOSSDSRAFSTVPLTPVFRPSDSVSSTPAATSTPSKOALQFFCY 
I CKASCS SQQEF0DHMSEPQHQQRLGE1 QHMSOACLUSLLPVPR 
DVLETEDEEPPPRRWCNTCQLYYMGDL10HRRT0DHKIAKQSLR 
PFCTVCNRYFKTPRKFVEHVKS0GHKDKAKEXKSJUEKE1AG0DE 
DHFITVDAVGCFEGDEEEEEDDEDEEEIEVEEELCKQVRSRDIS 
REEWKGSETYSPNTAYGVDFLVPWX5YICRICHKFYHSNSGAQL 
SHCKSLGHFENLQKY KAAKN P SPTTRPVSRRCAINARNALTALF 
TSSGRPPSQPNTQDXTPSKVTARPSQPPLPRRSTRLKT 


6345 


2 


3483 


PRVRTKLI LLVKDKKR Y ERVGGGPKRLGRDVEMEEMI EQLQEKV 
HELEK0NDTLKNRLISAXQ0LQTOGYR0TPYNNVOSRINTGRRX 
AN ENAG 1 jQECPR KG I KFODADVAETPH PM FTKY GNS 1»LEE ARGE 
I RNLENV I QSCRG01 EELEHLAE1 LKTQLRRKENEI ELSLLQLR 
EOQATDOR SN I RDNVEK I KLHKQLVE KSNALSANEGKF1 OLQEX 
QRTLK1SHDALKANGDELNMQLKE0RLKCCSLEKOLHSMKFSER 
R I EELQDR 1NDLEKERELLKENYDKLYDSAFSAAHEEQWKLKE0 
OLKVQIACiLETALKSDLTDKTElLDRLKTERDONEKliVQENREL 
QLQYLEQXQQLDBLXKR I XLYNQENDI NADELSEALfcLIKAQXE 
OXNGDLS FLVKVDSEJNKDLERSMRELOATHAETVOELSKTRNM 
MMQHKI NKDYOMEVEAVTRXMENL0ODYELKVEQyVHl>LDIRA 
AR I HKLEAOLKDI AYGTKQYKFKPEI MPD0S VDEFDETr HLERG 
ENLFE 1 H 1 NiCVTFSS EVLQASGDKEPVTFCTYAFYDFELQTTPV 
VRGLH PE YNFTSO YLVHVNDLFLQY 10KNTI TLEVHQAYSTEYE 
TIAACOLKFHtlLEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL 
RV?MIX)AIRLYR£RAKALGYITSNFKGPEHMQSLSQQAPKTAOL 
SSTDSTDGNLNELHI T I RCCNHLOSRASHLQPHPYWYKFFDFA 
DHDTA1 3 P S SND PQF DDHf4YF P V P MNMDLDR Y LKS B SI «S FYVFD 
DSDTQEN 1 Y I GKVNVPLI SIAHDRC I SG I FELTDHQKHPAGTIH 
VILKWKPAYLPPSGSJTTEDLCNF3RSEEPEVVQRLPPASSVST 
LVliAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQEGSVDEVKEN 
TE XMQCGKDDVSLLSEGQLAEQ S LAS S EDETE 1 TEDL E PEVEED 
MSASDSDDCI I PGPI SKN I KQPS EK I R1EI I ALSLNDSQVTMDD 
TIQRLFVECRFYSLPAEETPVSLPKPKSGCWVYYNYSNVIYVDK 
ENNKAKRDILKAlbCKQEMPNRSLRFTVVSDPPEDEODLECEDI 
G VAH VDLADKFQEGRDL 1 EQNJ DV FD ARADGEG I GKLR VTVEAL 
HALQSVYKQYRDDLEA 


6346 


2921 


53? 


ODRRLLRLELQKTCQPTSTMSGSKTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
G1FHGMRPQLWMRLSGAL0KXRNSELSYREIVKNSSKDETIAAK 
01 EKDLLRTMPSNACFASMGS IGVPRLRRVLRALAWLYPE IGYC 
QGTGMVAACLLLFLEEEDAFWMMSA1IEDLLPASYFSTTLLGVQ 
TDQRVLRifljIVQYLPRLDKLLOEHDIELSLITLHWFbTAFASVV 
DJKLLLRIWDLFFYEGSRVLFOLTLGMLHLKEEELIQSENSASI
FNTLSDIPSOMEDAELLLGVANRLAGSLTDVAVETORRKHLAYL
IADOGOLLGAGTLTNLSOWRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFOCTDPKNCSWSRCLPGLL
PNTALTPPTPLVCLYSLWQELTPDYSMEGHQRDHENYVACSRSH 
RRRAKALLDFRRHDDDELGFRXNDI IT1 VSQKDEHCWVGELNGL 
RGWFPAXFVEVLDERSKEYS I AGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTMDAVHAQMDVKLRSL 
1 CVGLNEQVLH LWbEVL.CS S LPTVEKW YQPWS FLRS PGWVQ I KC 
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSH 
DVDG 


6347 


2921 


53 


ODRRLLRbEbQKTCQPTSTtlSGSllTPACGPFSALTPS I WPQE I b 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPORbRWOAHbEFTKNHDVGDbTWDKIAVSbPRSEKbRSbVLA 
G I PHGMRPQLWMRLSG ALQK KRNS EbS YREI VKNSSNDETI AAK 
03EKDLLRTMPSNACFASMGS 1 GVPRbRRVbRALAWLYPEIGYC 
QGTGMVAACLLLFLE EEDAFVIMMSAI I EDLLP AS Y FS TTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLpEHDI ELS LI TLHWFLTAFAS W 
D I KLLLR I WDLF FY EGS R VL FQLT LGMLH LKE EEL 1 Q S EN S AS I 
FNTLSDIPSQMEDAELbbGVAMRLAGSLTDVAVETORRKHLAYL 
IADOGObLGAGTLTNbSOWRRRTORRKSTITALbFGEDDbEAL 
KAKNI KQTELVADLR EAI LRVARH FOCTD P KNCSWS ROLPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAXAbLDFERHDDDELGFRKNDIlTlVSOKDEHCWVGELNGL 
RGWFPAKFVEVLDERSKEYS I AGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLF I EEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVOSVNVTHDAVHAQMDVKIiRSL 
ICVGU^EQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC 
ELRVLCCFAFSI^QDWELPAKREAQQPLXEGVRDMLVKHHLFSW 
DVPG 


6348 


3 


3675 


AGAEKCFVTLLACFLAKCONKYKYEECKDLIKSMLRWELQFKBE 
KLAEOLK0AEELROYKVl,VHSOERELTOLREKLREGRDASRSLN 
EHL0ALLTPDEPDKSQGODLOECLAEGCRLAQKLVOKLSPENDN 
DDDEDVQVEVAEKVQKSS SPREMQKAEEKEVPSDSLEECAI TCS 
NSHGPCDSNQPHKN1K1TFEEDEVNSTLWDRESSHDEC0DALN 
1 LPVPGPTSSATNVSMWSAGPLSGEKAAIN I LEINEKLRPOLA 
EKKQQFRNLKEKCFLTQLACFLANQQNK YK YEECKDLI K FMLRN 
ERQPKEEKbAEQLKQAEELRQYXVLVHSQERELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSOGQDLOEQLAEGCRLA0HLVQK 
LS P ENDNDDDEDVQVE VXEKVQKS S A PREMP KAEEKEVP EDSLE 
ECAITCSNSHGPYDSN0PHRKTK1TFEEDXVDSTL1GSSSHVEW 
EDAVim PENES DUEE EE EKGPVSPRNLQESEEEEVPQESWDEG 
YSTl>SI PPEMLASYKSYSSTFHSLEE0QVCMAVD1GRHRWDOVK 
KEDHEATGPRLSRELLDEKGFEVLODSLDRCYSTPSGCLELTDS 
COPYRSAFYVLEOO^VGLAVMMDEIEKYOEVEEDQDPSCPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
QYLGLALDVDRI KKDOEEEEDOGPPCPRLSRELLEWEPEVIjOD 
£LDRCYSTPSSCLEQPDSCQPYGSSFYA1,EEKHVGFSLDVGEIE 
KKGKGKKRRGRRSKKERRRURKEGEEDQNPPCPRLSRELLDEXG
PEV1,QDSLDRCYSTPSGCL.ELTDSC0PYR£AFYILEQQRVGLAV
DMDEIEKYOEVEEDCDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLEITDSCQPYRS AFY I LEQQRVGLAVDMDE 1 EKYQEVEE 
DQDPSCPRLSRELLDEKEPEVljODSLiGRCYSTPSGYlrELPDLGC 
FYSSAVYSLEEQYU5LALDVDR1 KKDQEEEEDOGPPCPRLSREL 
I ,E WE PEVLODS LDRCYSTPSS CLEOPDS COP YGSSFYA LEE KH 
VGFSLDVGEIEKKGKGKKKRGRRSKKERRRGRKEGEEDQNPPCP 
RLNSMLMEVEEPEVLODSLD1CYSTPSMYFELPDSFQHYRSVFY 
SFEEEHISFALYVDNRFFTLTVTSLHU/FCKGVIFPQ 


6349 


• 


367? 


AGAEKCFVTLLACFLAKOONKYKYEECKDLIKSMbRNELOrKEE 
KLAEOLKQAEELROYKVLVHSOERELTOLREKLREGRDASRSLN 
KnLOALLTPDEPDKSOGODLQEOLAEGCRLAOHLVOKLSPENTJN 
DDDEDVOVEVAEKVOKSSSPREMQKAEEKEVPEDSLEKCAITCS 
N5KGPCDSN0PHKNIKITFEEDEVNSTLWDRESSHDEC0DALM 
ILPVPGPTSSATNVSMWSAGPljSGEKAAlNlLElNEKLRPOlA 
EKKOX)FRNLKEKCFL,TOLACFLLANQONKYKYEECKDLIKFt'iLRN 
EROFKEEKLAEOLKOAEELRQYKVLVHSCERELT0LREKLREGR 
DASRSLNEHI>OALLTPDEPDKSOGODLQE0LAEGCRLAOHLVQX 
LSPENDNDDDEDVQVEVAEKVOKSSAPREMPKAEEKEVPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHIIPENESDDEEEEEKGPVSPRNLOESEEEEVPQESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQ0VCMAVD1GRHRWDQVK 
KF.DHEATGPRLSREliLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
C0PYRSAFYVLEQQRVGLAVTJMDEIEKY0EVEEDODPSCPRLSR 
ELLDEKE PEVLQDS LGRCYS TPSG YLELPDI GQP YSSAVYSLEE 
0YLGLALDVDRIKKD0EEEEDOGPPCPRLSREL.LEVVSPEVL0I> 
SLDRCYSTPSSCLEOPDSCQPYGSSFYALEEiCHVGFSLDVGEIE 
KXG KG KKR R GRR SKKERRRGK KEGEEDONP PCPR LS R ELLDE KG 
PEVLQDSLDRCYSTPSGCLELTDSCOPYRSAFYILEOQRVGLAV 
DMDEIEKYQEVEEDODPSCPRLSGBLLDEKEPEVLQESLDRCYS 
T?SGCLEX*TDSCOP YRSAFY 1 LEQQRVGLAVDMDE I EKYQEVEE 
DQDPSCPRLSRELLDEXEPEVLQDSLGRCYSTPSGYLELPDLGQ 
P YS SAVYSLEEQYLGLALDVDR X KKDQEEEEDQGPPCPRLSREL 
LE WE PE VLQDSLDRCY ST PSS CLEQ PDS CQ PYG S S FYALEEKH 
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDONPPCP 
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
SFEF.EH 2 S FALYVDNRFFTLTVTSLHLVFOMGVI FPQ 


6350 


3 


367* 


AGAEKCFVTLLACFLAKOONKYKYEECKDL3KSMLRNBLOFKEE 
K LAEQLKQ AEELRQY KVL.VH SQERELTQLRE KLREGR D AS RS LN 
EHLOALLTPDEPDKSQGODLQEQLAEGCRLAQHLVOKL5PENDN 
DDD2DV0VEVAEKV0KSSSPREMOKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNI K I TFEEDE VNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPOLA 
EKKQOFRNLKEKCFLTOLACFLANOONKYKYEECKDLIKFMLRN 
ERQFKEEKLAE0LKOAEELRQVKVLVHSOERELT0LREKLREGR 
DASRSI^EHLQALLTPDEPDKSOGOPLOECLAEGCRLAOHLVOK 
LS P ENDNDDD ED VC VEVAEKVQK S S APREM P KAEEKE VPEDS LE 
ECA1 TCSNSHGFYDSHOPHRKTKI TFEBDKVDSTLJGSSSHVEW 
EDAVmiPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
YSTLS I PPEMLA5TYKS YSSTFHSLEEQQVCKAVDIGRKRWPgVK 
KEDHEATGPRLSKELLDEKGPEVXQDSLDRCYSTPSGCLELTDS 
COPYRSAFYVLE0OKVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 
ELLDEKEPEVLODSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
0YLGLALDVDR1 KKOyEEEEDQGPPCPRLSRELLEWEPEVI^JD 
SLDRCYSTPSSCLE0PDSCCPYGSSFYALEEKHVGFSLDVGE1E 
KKGKGKKRRGRRSKKERRRGRKEGSEDQNPPCPRLSRELLDEKG 
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSArYILEOCRVGLAV 
DMDE1EKY0EVEED0DPSCPRLSGELLDEKEPEVL0ESLDRCVS 
TPSGCLELTDSCQPYRSAFY1LEQQRVGLAVDMDE1EKYQEVEE 
DODPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGO 
PYSS AV YS LEEQ Y LG I ALD VDR I KKDQEEE EDQGP PCPRLS REL 
bEWEPEVLQDSLDRCYSTPSSCLEOPDSCOPYGSSFYALEEKH 
VGFSLnVGElEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 
RLNSMLMEVEEPFVLODSLDICYSTPSMYFELPDSFQHYRSVFY 
S FEE EH IS FALY VDNR F FTLTVTS LHLVFOMGV 3FPQ 


6351 




319 


REARRRTERSQLGRNLWEVANGRSLVWGAEAVQALRERLGVGG 
RTVGALPRGPR0KSRLGLPLLLMPEEARLLAE1GAVTLVSAPRP 
DSRHHSLALTSFKROQEESFQEQSAIiAAEARETRROELLEKlTE 
GQAAKK0KLEQASGA-SSSQEAGSS0AAKEDF.7SDGQASGEOKEA 
G PS S S OAG PSNG VA PLP R S ALLVQLATARP R P V KAR PLD WR VQS 
KDWPHAGRPAHELRYS1YRDLWERGFFLSAAGKFGGDFLVYPGD 
PLRFHAHY1 AQCWAPRDTI PLQDLVAAGRLGTSVRKTLLLCSPQ 
PDGKWYTSL0WASLO 


6352 


235 


923 


WSEVU.SPCHAAKCKGLSKLRITMKTRAISLAAUATEFVCGRSAP 
AMAR S h V] 1DTV FY CLS VY QVKIS PTPQLGAAS S AEGH VGOGAPG 
LMGNMN P EGGVNH E NG MNR DGGK 1 PEGGGGNG E P ROO P Q P P P E E 
PAQAAWEGPQPENMQPRTRRTKPTLLQVEELESVFRHTQrPDVP 
TRRELAS^GVTEDKVRVWFKNKRARCRRHQRELMLAKELRADP 
DDCVYIWD | 


6353 


65 


6*72 


RFACAGAI PEARARPPDVQAAEEEKEMDLPDSASRVFCGJU LSM 
WTDD\,TaAIII>AOKNMLDRFSKTNEMLLhnTWLSSARLClOMSER 
FU-JHTRTLVEMKRDLDS1 FRRIRTLKGKLAROHPEAFSH 3 PEAS 
FLEEE DEDPI PPS TTTT1 ATS EQSTGS CDTS PD7VS PSLSPG FE 
DLSHVOPGSPA3NGRSOTDDEEMTGE 


6354 


965 


510 


PSLRPMEPTRDCPLFt^AFSAILPMGAIDVSDLRPVPDNOEVFC 
H P VTDQS L I VELLE L0 AH VRG EAAARYH FEDVGG VQG ARAVH VE 
SVQPLSLENLALRGRCOEAWVl^GKQQIAKENQOVAKDVTLHOA 


6355


158 


1662 


RGSSAAFRGSGLRGAMIRRVLPHGKGRGLLTRRPGTRRGGFSLD 
WDGKVSEIKKKIKSILPGRSCDLIjQDTSHLPPEHSDWJVGGGV 
bGLSVAYWLKKLESRRGAIRVLVVERDHTYSQASTGLSVGGlCO 
QFSLPEN I QLSLFS AS FLRN I NEYLAWDAPPLDLRFNPSG YLL 
LASEKDAAAMESNV KVQROEGAK VS LMS PDQLRJi KFPW 1 NTEGV 
ALASYGMEDEGWFDPWCLLOGLRRKVQSLGVX,FCOGEVTRF\ f SS 
SQRMLTTDDKAVVL KR I HE VHVKMDRSLEYQPVECAI V 1 NAAGA 
HSAQIAALAGVGEGPPGTLQGTKLPVEPRKRYVYVWHCPOGPGL 
ETPLVADTSGAYFR R EGLGSK YIX5GRS PTEQEEPDPANLEVDHD 
FFODKVWPHbALRYPAFETLKVQSAWAGYYDYHTFDONGWGPH 
P LWNMY FATG FSGHGLQOAPG 1G RAVAEMVLKGR FQTI DLS PF 
LFTRFYLGEKI OENN 1 3 


6356 


3S4 


633 


TGLTSSCL PLQVMM7 KRTKDKGKFSS VTVST 1 DEEEEE I EAREV 
ADSYAONAKVIEKOLERKGMSKRRLQELAELSAKKAKMKGTLID
NQFX 1


6357 



2 


91 5, 


GLLRNMALLVR V LRNQTS I SQWVP VCSRLl P VS PTQGQGDRALS 
RTSQW PQMSQSCACGG S EC; I PG 1 DI QLNR KYHTTRKLSTT KDS P 
GPVEEKVGAFTK1 IEAMGFTGPLKYSKWKIKIAALRMYTSCVEK 
TDFEEFFLRCQMPDTFNSWFLITLLHVWNCLVRMKQEGRSGKYM 
CR 1 1 VH FMW EDVCOKGR VMGVNPY I LKKNM I LMTNHF YAAI LGY 
DEG I LS DDHGLAAALWRTFFNRKCEDPRKLELLVEYVRKQ I QYL 
DSMNGEDLLL?GEVSWRPLVEKNPQSILKPHSPTYNDEGL 


6358 


2009 


1040 


ASDALHSLSAPVLRLSSRSAARPATMTEQAISFAKDFLAGGIAA 
A J S KTAVAP 3 ER VKLLLOVQHASKOI AADKOYKGI VPCI VRI PK 
EQGVLSFWRGNI.ANVI RY FPTQALNFAFKDK Y KQ I FI.GGVDKHT 
CFWRYFAGNLiASGGAAGATSLCFVYPLDFARTRLAADVGKSGTE 
REFRGLGDCLVKITKSDGIRGLY0GFSVSV0GII1YRAAYFGVY 
DTAKGMLPDPKNTHIWSWMIAQTVTAVAGWSYPFDTVRRRMM 
MQSGRKGADIMYTGTVDCWRKIFRDEGGKAFFKGAWSNVLRGMG 
GAFVLVLYDELKKVj 


6359 



98 


1086 


VCRQEFXKMKEDCI >PS SHV PI SDS KSIQKSELLGLLKTYNCYHE 
GKSF0LRHRFEEGTL1 3 EGLLNIAWGLRRPIRLQMQDDREOVHu 
PSTSWMPRRPSCPbKEPSPONGNlTAQGPSIOPV)lKAESSTDSS 
GPLF.EAEEAPQLMR1 KSDASCMSQRRPKCRAPGEAQRIRRHRFS 
INGilFY^KTSVFTPAYGSVTWRVNSTKTTLQVLTLLLNKFRV 
EDGPSEFALY1 VHESGERTKLKDCEYPLISR1LHGPCEK1ARIF 
LMEADLGVEVPHEVTvOYI KFEMPVLDSFVEKLKEEEEREI I KLT 
MKFQALRLTMLQRLEQLVEAK 


6360


1 


345 


GTRGAVPSTLEEVVLPPRSCRVFW1HSGTTMSKVSFKITLTSDF 
RLPYKVLSVPESTPbTAVLKFAAEEFKVPAATSAIITNDGlGIN 
PAQTAGNVFLKJJGSELR1 IPRDRVGSC 


6361 


61 h 


i5e 


HPGIX50LOHCALAP0AGNRRCRFHGRLHALTRSTHRGKPMSIM0 
FKDTLNTPLPDSS PVAVPLGAP1 AVASTLSVE11NDGVETGI WAC 
APGRWRRQI TSQEFCH F I QGRCTFTPDDGETLH I QAGDALMLPA 
NSTG I WD I QETVR KTYVL 3 L 


6362 


350 


1576 


TTKDGSHS AALKLQOL P PTSSSS AVSEAS FS YKENL1 GALL.A J F 
GHLWSIAlALOXYCHIRIoAGSKDPRAYFKTKTWWLGLFLMLLG 
ELGVFASYAFAPLSLIVPLSAVSVIASA1IG1 1 FI XEKWKPKDF 
LRRYVLSFVGCG1^VVGTYLLVTFAPNSHEKMTGENVTRHI.VSW 
PFXLYMLVE 1 1 L FCLLL YFY XEKNANN I WI LLLVALIiGSMT W 
TVKAVAGMLVLS 1 QGNLQLDYPI FYVMFVCMVATAVYQAAFLSQ 
ASQM YDS S LI AS VG Y 3 LS IT I AI TAGAI FYLDF I G EDVLH I CMF 
ALGCLIAFLGVFLITRNRKXPIPFEPYISMDAMPGMQNMHDKGM 
TV0PELKASFSYGA1.ENNDNISE1YAPATLPVM0EEHGSRSASG 
VPYRVLEHTKKE 


6363 


23 


1201 


RRTRLGSSFPRRRDSSAMESYDVIAN0PWIDNGSGV1KAGFAG 
D03 P KYCFPNYVGR P KHVRVMAGALEGDI FIGP KAEEHRGLLS I 
R Y PM EHG I VKDWNDM ER I WQ Y VYS KDQLQTFS E EH P VLLTE A PL 
N PRKKR ERAAE VF FETFNVPA LFI S MQAVLSL YATGRTTG W LD 
SGDGVTHAVPI YEGF AMPHS I MRID I AGRDVSRFLRLYLRJCEGY 
DFHSSSEFEIVKAIKERACYLSINPQKDETLETEKAQYYLPDGS 
TJEIGPSRFRAPELLFRPDLIGEESKGIHEVLVFAIOKSDWDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPOE 
RLYSTWIGGSILASLDTFKKMWVSKKEYEEDGARSIHRKTF 


6364


21 


1201 


RRTRLGSS FPRRRDS3AMESYPVI ANQPWI DNGSGVI KAGFAG 
DQ1PK YCF PNYVGRPKHVRVMAGALEGD I FIGPKAEEHRGLLS I 
RYP^uSHGIVKDWNDMERlWQY^/YSKDQLQTFSEEHPVLLTEAPL 
NP^KNRERAAEVFFETFNVPALFISMOAVLSLYATGRTTGWLD 
SGDGVTHAVPI YEG FAMPHS I MRI DIAGRDVSRFLRLYLRKEGY 
DFHSSS EFE I VKAI KERACYLS I NPQKDETLETEKAQ Y YLPDGS 
TIElGPSRFRAPELLrRPDLIGEESEGIHEVLVFAIOKSDMDLR 
RTbrSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKlRISAPOE 
RLY STW 3 GGS I LASbDTFKKMKVS KXE YEEDGARS 1 HR KTF 


6365 


234 


1995 


KHKS RAS CAARAQA f G PS R ER EVH S R FRSGbR RLGESN SGCCTM 
ASMGTLAFDEYGRPFL1 1 KDOERKSRI AJGLEAbKSHIN.AAKAVA 
NTNRTSbGPNGI.DKKMVDKDGDVTVTNDGATILSMMDVDHQIAK 
LMVELS KSC'DDE I GCGTTGVWLAGALbEEAEOLbDRG I HP I R I 
ADGYEQAARVAIEHbDKl SDSVLVDI KDTEP b 3 QT AKTTLG S KV 
VKS CH ^ OMA E I AVN AVLTVAJDKER R DVDFEL1 KVEGKVGG R bED 
TKba KGVI VDKDFSHPQMPKKVEDAK1 Al bTCPFF.PPKPKTJOIK ! 
bDVTSVEDY KAbOKYF.KEKFEEMlOOlKETGANbAlCOWGFDDE 
ANHLbLONTCbPAVRWVGGPEIEblAIATGGRlVPRFSELTAEKb 
GFAGbVOElSFGTTKDKMbVIEOCKNSRAVTIFIRGGNKWIIEE 
AKR S bHDAbCVIRWI .1 RDNRWYGGGAAEISCAbAVSQEADKCP j 
TbECYAMRAFADAbEVl PMAbSENSGM>TPICTMTEVRARQVKEM 
NPAI »G I DCbKKGTNDMKQQHVl ETLI GKKQQ I SbATOMVRM I bK 
1CD1RKPGISEE 


6366 


257 


1698


GNKEGAHSSTFWVbbSlFbGAVAMbCKEOGITVLGbNAVFDlLV 
3GKFKVLE2 VQKVLH K DKSbENbGMbRNGGbbFR MTbbTS GGAG 
MbYVR VJR J MGTGPP AFTEVDN PAS FADSMbVRAVN YNYY Y SbNA 
WbLbCPWWl.CFDWSMGCIPblKSISDWRVIAbAAbKFCI.lGblC 
OAbCS EDGKKRR I bTbGbGFbVI PFbPASNbFFRVGFWAERVb 
YbFSVGYCVbbTFGFGAbSKHTKKKKblAAWbGIbFlNTLRCV 
bRSGENRSEEObFRSAbSVCPbNAKVHYNIGKNbADKGNOTAAl 
R YY RE A VRbN PK Y VHAMNNbGN I LX E RN EbQEAEE bLS bAVQI 0 
PDFAAAWMNLGIVQNSbKRFEAAEQSYRTAIKHRRKYPDCYYNb 
GRbYADbNRHVDAbNAWRNATVbKFEHSbAWNNMI I LbPNTGNb 
A0AEAVGREAbELIPNDHSU4FSLANVLGKS0KYKESEAbFbKA 
I KAKPNAASYHGNbAVLYHRWGHbDbAKKHYEl SbQbDPTASGT 
KENYGLLRRKbELMOKKAV 


6367 


287 


1934 


SJGFPVMbV b£ J LL Y TC EM FQDS V A FED VAVS F TO E E W AbLD PS 
OKNLYRDVy.QETFKNbTSVGKTWKVONIEDEYKNPRRNb^LMRE 
KbCFSKESHHCGESFNQIADDMbNRKTbPGITPCESSVCGEVGT 
GHSSLNTH3 RADTGHKSSBYOEYGENPYRNKECKKAFSYbDSFQ 
SHDKACTKEKPYDGKE'CTETFISHSCIQRHRVMHSGDGPYKCKF 
CGKAFYFbNbCblHERIHTGVKPYKCKQCGXAFTRSTTbPVHER 
TH1Y3VNADECKECGNAFSFPSEIRRHKRSHTGEKPYECKOCGKV 
FISFSSIQYUKMTHTGEKPYECKQCGKAFRCGSHbOKHGRTHTG 
EKPyECROCGKAFRCTSDLORHEKTKTEDKPYGCKOCGKGFRCA 
SQbQj HERTHSGEKPHECKECGKVFKYFSSLRIHERTHTGEKPH 
ECKOCGKAFRYFSSbHIHERTKTGDKPYECKVCGKAFTCSSSIR 
YHER'J'HTGEKPYECKHCGKAFISNYIRYHERTHTGEXPYOCKOC 
GKAF3 RASSCREHERTHTINR 


6368 


1 


327 


RPVPAKbNPRSWPRTAGAbPbRPPPbTMAVFHDEVEIEDFQYDE 
DSETYFYPCPCGDNFSITKBDLENGEDVATCPSCSbl I KV3 YDK 
DQFVCGETVPAPSANKEbVKC 


6369


a 


1745 


AGCCR DTR F PTPRGPGSbCHN FCRS AACTVTRT I HGS PREDTGT 
PRSREMMFODSVAFEDVAVSFTOEEWALbDPSOKNbYRDVMOET 
FKNbTSVGKTWKVONIEDEYKNPRRNbSbMREKbCESKESHHCG 
ESFNOlADDKLNRKTbPGITPCESSVCGEVGTGHSSLNTHIRAD 
7GHKSSEY0EYGENPYRNKECKKAFSYLDSF0SHDKAC7KEKPY 
DGKECTETFISHSCIORHRVMHSGDGPYKCKFCGKAFYFLNbCb 
3 HER 1 HTGVX PYKCKOCGKAPTRSTTbPVHERTHTGVW ADSCKE 
CGNAFSFPSE1RRHXRSHTGEKPYECKQCGKVFTSFSSIQYHKM 
THTGEKPYECKO^GKAF^CGSHl^KHGRTHTGEKPYECROCGKA 
FRCTSDbQRHEKTHTEDXPYC-CXQCGKGFRCASObOIHERTHSG 
EKPHECJCECGKVFKYFSSLRIHERTHTGEXPHECKOCGKAFRYF 
SEQ
ID
NO:



6370 



6371 



6372 



6373 



6374 



6375 



Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence



Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence



1711 



214 2 



67 



S3b 



288 



711 



2105 



1535 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



S S ]Jl I H ERTHTGDKP Y ECK VCGKA F J'CSS S I RYKERTHTGEKPY 
ECKHCGKAFISNYIRYKERTHTGFKPYQCKQCGKAFIRASSCRE 
HERTHTINR 



329 I FVLPEgRLRTERTWPRSPGLGRGAAAAGARTAGAGLLRLLl/XXS 

AbVGGtRPVTMTTPANACNASKTWELSLYELHRTPOEAIMDGTE 
3AVSPRSLHSEIjKCP2CLDKLKN114TTKECLHRFCSDCIVTALR 
i SGNKECPTCRXKLVSKRSLRPDPNFDALI^KIYPSREEYEAHQD 
j RVLI RLSRLHNQQALSSS 1 EEGLRKOAKKRAQR VRRPI PGSDQT 
I TTMSGGEGEPGEGEGBGEDVSSDSAPDSAPGPAPKRPRGGGAGG 
I rsVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTlX3FPSPPGAPS 
j PPFPGGE1ELVFRPHPLLVEKGEYCOTRYVKTTGNATVDHLSKY 
IALRIALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGG 
I DGPEHPALPSLEGVSEKQYTIYJAPGGGAFTTLNGSLTLELVNE 
I KFWXVSRPLELCYAPTKDPK 



GVANMSTAMN FGTKS FQPR PPDKGS FPLDK LGECKSFKEK7MKC 
1. HhTNNFENALCRKE SKEY LECRME R X LMI,QE P LS KLG FGDLTSG 
KS EAXK 



R VChl ASEGKAEGRYXKLEDLLEXS FS LVKMPSLOPWMCVMKH 
l.PKVPEKKLKLVMADKELYRACAVEVKRQIWODNQALFGDEVSP 
LLKOYILEKESALFSTLLSVLHNFFSPSPKTRROGEWQRLTRM 
VG KNVKXY UM VLQFLR T LFLRTRN VH Y CT1 RAELLMSLHDLDVG 
E 1 CTVDPCH K FTWCLDAC I R ER F VE)V KRAR E LQC FLDG VKKGQE 
OVLGDLSMILCDPFAINTLALSTVRHLOELVGOETLPRDSPDLL 
LL1 K LLALGQGAWDK I DS QVFXEPXMEVEL1 tr FLPMLMS FLVD 
DYTFNVDQKLPAEEKAPVSYPNTLPESFTKFLOEQRMACEVGliY 
YVLH I TK0RNKNALLRLI.PGLVETFGDLAFGD1 FLHLLTGNLAL 
L^.DEFALEDFCSSLFDGFFLTAS PR KENVHRHALRLL3HLHPRV 
AFSKLEAL0KALEPTG0SGEAVKELYSOLGEKLEOLDHRKPSPA 

OAAE T PALE LPL? S VPA PA Ph 

P^KAARASPARLPAMVSVJl 1 3RLWL1 FGTLVPAYYSYKAVKSK 
Dl KEYVKWKMYWI I FALrTTAETFTDl FbCWFPFYYELKIAFVA 
WL LS P YTKGSSLLYR KFV'H PTLSS K E KE I DDCLVQAKDR S YDAL 
VHFC- XRGLNVAATAAVNAAS KGQGALS ERLRS FS MQDLTT IRGD 
GAPAPSGPPPPGSGRASGXKGQPXNSRSASESASSSGT7A 



HKLFCSYfSTSEFPSSTKHHSCPTK'lFCNYTSSTIFLSSTRDHS 
CFTHTFCNYTSSTIFLSSTRDHSCPTHTSCNYTSSTIFLSSTRD 
HSCFTHTSCNYTSSTIFLSSTRDHSCPTHTFCNYPRP3IRLSSC 
CPAELQTEGSNGXXEVLSGFOVVLEDTVLFPEGGGQPDDRGTIN 
Dl S VLRVTRRGEQADHFTOTPLDPG SQVLVRVDWERRFDHMQQH 
SGOHLI TAVADHLFKL.KTTS WELGRFRSAI ELDTPSMTAEQVAA 
1E0SVNEK1RDRLPVNVRELSLDDPEVE0VSGRGLPDDHAGP1R 
VW] EGVDSNMCCGTHVSNLSDLQV1 K3LGTEKGKKNRTNLIFL 
SGNR\OiKWMERSHGTEKALTALLKCGAED}r/EAVK K IiQNSTKI h 
OKNNLNLLRTLA VH I AH£ LRNS PDWGGW I LHRKEGDSEFMN 1 1 
ANE 1 GSE2TLLFLTVGDEKGGGLFLLAGPPASVETLGPRVAEVL 
EGXGAGKXGRFQGXATXKSRRMEAQALLQDYISTQSAKS 



AlMAJiATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGLAAWSRT 
CPGRFRRPGQQWRGPTMLVTAYLAFVGLLASCL<51iELSRCRAK 
PPGRACSNPS FLR FQLDF YQ VYFliALAADWLQAP YLY KLYQHYY 
FLEGO 3 A I L YV CGLAST VL FGLVAS S LVDWLGRXNS CVL FSLT Y 
SLCCLTKLSOBYFVLLVGRALGGLSTALLFSAFEAWYIHEHVER 
HDF PAE W I PATFARAAFWNHVLAVVAG VAAEAVAS W IGLGPVAP 
FVAAI PLLAIAGALA^Rr^GENYDRCRAFSRTCAGGLRCLLSDR 
RVLLLGTIOALFESVI FI FVFLWTPVLDPHGAPLG3 3FSSFKAA 
SLiyGSSLYRIATSXRYHLCPMHLLSLAVLIWFSLFMLTFSTSP 
GOE S P VES F IAFLL IELACG L Y FPSMS FLRRKVI P ETEQAGVLN 
1 W ?RV P LHSLAC LGLLVLHDS DRKTGTRNMFSI CSAVMVMALLAV 
( VGLFTWRHDAELRVPSPTEEPYAPEL 


6376 


380 


1437 


JSSTDJDHYRFSFLVNSKMPSKESWSGRKTNRAAVHKSKQEGRQ 
ODLLIAALGMKLGSPKSSVTlWOPLKLFAySQLTSLVRRATLKE 
NEC; I PKVEKIHNFKVHTFRGPHWCEYCANFMWGLIAOGVKCADC 
G LMVHKQCS KMVPNDCK FD L Ktf V K KVY SCDLTTLVKAHTT KRPM 
WDMCIRElESRGhNSEGLyRVSGFSDLIEDVKMAFDRDGEKAD 
ISVNMYEDIN1ITGALKLYFRDLP1PL1TYDAYPKFIESAK2MD 
PDEOLETLHEALKLLPPAHCF.Tl^YLMAHLKRVTLHEKENLMNA 
ENLGIVFGPTLXRSPELDAMAALNDIRYQRLWELLIKNED1LF 


6377 


2331 


1841- 


SRI RRRSSRRPREPPGPSRRRRRRRPDPRTMPSEKTFKQRRTFE 
ORVEDVRLI REOHPTK1 PV3 J ERYKGEKQLPVLDKTKFLVFDHV 
NMSEL2K1 1 RRRLQLNANQAFFLLVNGHSMVSVSTPI SEVYESE 
XDEDGFLYMVYASQETFGMKLSY ! 


6378


6U6 


191 


GAGPWEAFPDGIGRRSRRARLPOYKRPPGRVGGGDSGRRNMAVA 

WAEY}IAD1YDKVSGDMOKC>GCDCECLGGGR1SHOSOI>XK1)IVYG 
YSMAYGPAQHAl STEKI KAKYPDYEVTWANDGY 


6379


35 


378 


EJLAGSPSPSRAALRRCAPQRSQAPRWPDRAACRRSFCK3SQGRAY 
LFNS WNVGCG P AEERVLLTGLHAVADI YCENCKTTLGW K YEHA 
FESSQXYKEGKY 1 1 ELAHM1 KDNGWI) 


6380


1414 


462 


PAV0GQRGAGPP iXJRGSGNr4ARFALTVVRHGETRFNKEKI I OGQ 
G VDK PLS ETG FKQAAAAG I FLNNVK FTHAFS SDLMRTKQTMHG J 
LERSKFCKDMTVKYDSRLRERKYGWEGKALSELRAMAKAAREE 
CPV FTPPGC;PTL.DOVKMttGl DFFF F t .COL 3 LKEADGKEOFSOG^ 
PSNCLETSLAEI FPLGXNHS5KVNSDSGI PGLAASVLWSHGAY 
MRSLFDYFLTDLKCSLPATLSRSELM5VTPNTGMSLFIINFEEG 
REVKPTVQCICMNLQDHLNGLTENSLGLNLPSKSNHFEPhKGVP 
I.ALFTSLLC 


6381


1668 


21 £ 


AVVRAQGSRGFSGAGWRPRQAAAMNFSEVFKLSSLLCKFSPDGK 
YIASCVOYRLWRDVNTLOI LQLYTCLD010HI EWSADSLFI LC 
AMY KRGLVQ VWS LEQPEWH CK I DEGS AGLVASCWSPDGRHI LNT 
TEFHLRITVWSLCTKSVS Y I KYPKACLQGITFTRDGRYMALAER 
RDCKD YVS 1 F VCSDWQLLRH FDTDTODLTG I EWAPNGCVLAVWD 
TCLEYKILLYSLDGRLLS7YSAYEWSU31KSVAWSPSSQFI*AVG 

SYDGKVR1LNHVTWKM1TEFGHPAA1JJDPK1VVYKEAEKSP0LG 
LGCLS FP PPRAG AG PL PSSESKY EI AS VP VSLQTLKPVTDRANP 
XI G I GMLAFS PDS Y FLATRNDN1 FKAVWVWDI QKLRLFAVLEQL 
S PVRAFQWDPOQ PRIAI CTGGSRLYLWS PAGCMS VQVPGEGDFA 
VLSLCWHLSGDSMALLSKDH FCLC FLETEAWGTACRQLGGHT 


6382 


2 


1062 


FEEDEDRNLCL1AYPLKGDHGIVDIVDNSDCEPKSKLLRWTTNK 
KHHVLETEKTPKDWVROHRKEEKMKSHKLEEEFEWLKKSEVLYY 
TVEKKGNISSQLKHYNPKSKKCHOOQLORMKENAXHRNOyKFlL 
LE NLT S RY EV P CV LDLKMGT R QHGDD AS ESKAANQ I R KCQQSTS 
AV 1 GVR VCGMQVYQAGSGOLMFMNK YHG RKLS VQG FKEALFQFF 
HNGRYLRRELLGPVLKKLTELKAVLEROESYRFYSSSLLVIYDG 
KERPEVVLDSDAEDLEDLSEESADESAGAYAYKPIGASSVDVRM 
1DFAIITTCRLYGEDTVVHRG0DAGYIFGL0SLJD1VTEISEESG 
E 


6383 


3159 


1061 


S PAPGR PS P HGSG" P AARAAAAP AM PS AXQRG S KGGHG AAS PS EK 
GAHPSAARPLAAPTPAAPACRS PS PGGAPAS FPGRAPRSLASOP 
AARAAAAPAMPSAKQRGSKGGHGAASPSEKGAHPSGGADDVAXK 
PP P APQQP PPPP APH POQHPQQHPQNQAHGKGGHRGGGGGGGKS 
SSSSSKS AAAAAAAASSSASCSRRLGRALNFLFYLALVAAAAFS 
GWCVHHVLEE VQQVKRSHQDFS ROR EELGQG LCCVEQKVQSLQA 
TFGrFESILRSSOKKODLTEKAVKOGESEVSRlSEVLOKU)NEl 
HWKDARERCFTSLENTVEEKI.TELTKSINDNIAI F 
TEVQKRSQKElNDMKAKVASLEESEGNKQDLKAIiKEAVKEIOTS 
AKSREWD^3EALRS7LOTMESDJyTEVKELVSLKOEQOAFKEAAI> 
TERUULOALTEKLLRSEESVSRLPEEIRRLEEELROLKSDSHGP 
KEDGG FR H S EAFEALQQKSQG LDS RLQH VEDGVLSMQVASARCT 
ES LES L LS KSQEKEQR LAALOG RLEGLGS S EADODGLASTVK S L 
GETOLVLyGDVEELKRSVGELPSTVESLQKVGEOVHTbLSODQA 
QAARLPPODFLDRLSSLDNLKASVSQVEADLKMLRTAVDSLVAY 
SVKIETNENNLESAKGliLDDLRNDLDRLFVKVEKIHEKV 


6384 


738




IWEVPVCLTHLLHLOOANQPLPPPSSS1NEEDADEANRA1GEKR 
AAPDSGKKP KTPKTKQOKDPNE PQXPVS AYALPFRDTQAAl KGQ 
NPNATFGEVSQIVASMWDSUJEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVS KAAAESAEAOTI RSVQOTIASTNLTSS LLLNTPLSO 
HGTVSASF0TL0CSLPRS1AFKPLTMRLPMNQIVTSVTIAANMP 
SN I GAPLI5 SMGTTMVG SAPSTOVSPS VQTQQHQHQWQQQQQQ 
000MO0MO0OOLQ0H0MH0Q 3 QWMQQOHFQHHMQQHLOQOQOH 
l.QGQ 1 N0O0LOQ0 U»R hOWQ LQHMQHQSQ PS PRQHS PVASO I 
TS P I PA I G SVQVPSQQHQSQ1 OSQTQTQVLSQVS I F 


6385


2 


lse* 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAPAAGESLSGTRES 
LA0GPDAATTDELSSLGSDSEANGFAERR3 DKFGFI VGSCGAEG 
Al.EEVPLEV LRCRESK W LDMUWWD KWMAKKH KXI RLRCQKG .1 P 
PSliRGRAWCYIiSGGKV'KJjQONPGKFDELDMSPGDPKWLOVIERD 
LHROFPFHEMFVSRGGHGOODLFRVLKAYTLYRPEEGYCOAOAP 
lAAVLbMKMPAEOAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 
FSLLQKVSPVAHKHLSRQKIDPLLYMTEWPMCAFSRTLPWSSVL 
RWDMFFCEGVX1 1 FRVGLVLLK1IALGSPEKVKACC?G0YETI ER 
LRSLSPKI MOE AFLVQE WELPVTERQ IEREHL3 OLRRWQET RG 
ELOCRSPFRL1IGAKAI LDAEPGPRPALQPSPS 1 RL.PLDAPLPGS 
KAKPXPPKQAQKEQRKQMKGRGQLEXPPAPNOAMVVAAAGDACP 
P0HVPPKDSAPXDSAP0DLAP0VSAHHRSQESLTSOESEDTYL 


6386 


819 


195 


TVCGSFYIGI MORASRLKREUiMLATEPPPGJ 7CWQDKD0MDDL 
RAQI IjGGANTPYEKGVFXLEVI 1 PERYPFEPPQIRFLTPI YHPN 

i dsagri cldvlkxppkgawrpslni atvlts i ql.lmsepnpdd 
plmadissefkynkpaflknarowtexharoxoxajdeeemldnl 
peagdsrvhnstqkrkasqlvg1 ekkfhpdv 


6387


1 


66; 


PGPTHASAX)AWADAWA0PN;4AMHNKAAPPQ1PDTRRELAELVKR 

nsns kndrrkr kfk eafrlfskssvts aaavs alagvqdol3 ex 
repgsgtesdtspdfhnqenepsqedpedldgsvqgvkpqkaas 

S TS SG S HH S SH KKR KN KNR H S PS GM FDY DFE I DLKJLNKK P RAD Y 


6388 


"1 


662 


pgpthasadawadawaopnhamhnkaappoipdtrrelaelvkr 

XQEUAETLANLEROI YAFEGSYLEDTQMYGNI 1 RGWDRYJbTKQK 
NSNEKNDRR^RKFKEAERLFSKSSVTSAAAVSALAGVODOLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGS HHS SH KKR KNKNRHS PSGMFDYDFE 1 DLKLNKKPRAD Y 


6389 


1074 


497


AEPGDRMAGHRL VLV LGDLH I FHRCNSLPAXFKKLliVPGKI QH I 
LCTGNLCTXES YD YLKTLAGDVH I VRGDFDEKLN Y P EQKWTVG 
QFKIGLIHGHOVIPWGDMASLAL,LOROFDVD1LISGHTHKFEAF 
EHENKFY1NPGSATGAYNALETNI I PSFVLMDIQASTWTYVYQ 
L1GDDVKVER IEYXKP 


6390 


1S8 


535 


GEBRKEGRAPGKAFAPERNPAKMEKEETTRELLLPNWOGSGSHG 
LTIAQRDDGVFVOEVTONSPAARTGWXEGDOIVGATIYFDNLO 
SGEVTQLLNTMGHHTVGLKLHRKGDRPFPSLGQTWDP 


6391 


5386 


2897 


VRWNSXTECYLSIOTOENFPANLNELVNCIVISStVTTORXLKA 
MSLU3SRN0IARAVLNPNPKDFCTKDLLTTTSER1IAYLRDFNE 
DOXXA1ETAYAMVKHSPSVAKICLIHGPPGTGKSKTIVGLLYRL 
LTENQRKGHSDENSNAKI KQNRVLVCAPSNAAVDELMKKI I bEF 
KEKCKDKKNPLGNCGDINLVRUSPEKSINSEVT^KFSLDSQVNHR 
MKKLELPSHVOAMHKRKEFLDyO^DELSRORALCRGGRElOROEL 
DENISK^S"<EROFbA£KTKEVOGRPOKTOSt 7 TLFQHT TCfTL? 
TSCGbLLESAFKGQGGVPFSCVIVDEAGQSCElETbTPblHRCN 
KLILVGDPKOLPPTVlSMKAQEYGyDQSMMARFCRLLESNVEKN 
MISRLPILQLTVOYnMHPDlCLFPSNyVYNRNLKTNROTEAlRC 
SSDWPFOPYLVFDVGUGSERRDNDSYINVQEIKLVME2IKLIKD 
KRKDVSFRNlGIITHYKAOKTMIOKDIiDKEFDRKGPAEVD'IVDA 
r vv< r\ v •VL' v v i v ] v, v rt/yvol Uvjo lor LrtoliyAui' v J J I t\r\ r\ I our 
ILGHLRTLMEN0HWNOL1ODAQKRGAI 1 KTCDKNYRHDAVKl LK 
bKPVbQRSbT}lPPTIAPEGSRPC^GbPSSKLDSGPAKTSVAASL 
yHTPSDSKEITLTVTSKDPERPPVHD0L0DPRLLKR^5GIEVKGG 
I FbWDPQ P S S POH PGATP PTGEPG FPWH QDZ,S HVQQPAAVVAA 
LSSHKPPVRGEPFAASPEASTCOSKCDDPEEELCHRREARAESE 
GEOEKCGSETHHTRRNSRliDKRTLEOEDSSSKKRKLL 


6392 


912 


186 


GRTGVDLASSMAHKL01RbL»TWDVKDTLL»RLRHPLGEAYATKAR 
AHGLEVEPSALEOGFROAYRAQSHSFPNYGLSHGLTSROWWI.DV 
V LQTFH b AG VQDAQA VAP I A EQLY KDFS H PCTWQ VLDGA E DTLR 
ECRTRGLh LAVISNr DRRLEG1 Ia5GLGLREHFDFVLTSEAAGWP 
KPDPRIFOEALRUO-IMEPWAAHVGDNYLCDYOGPRAVGMHSFL 
WGPOALDrWRDSVPKEHILPSLAHLLPALDCLEGSTPGL 


6393 


201? 


730


TGGS X MAAVATCGS VAAS TGS AVATAS KSNV7S PQRRGPRASVT 
NDSG? RLVS 1 AGTR PSVRNGObl.VSTGUPAuDO^LGGGLAVGTV 
LLIEEDKYTJIYSPLLFKYFLAEGIVNGHTLLVASAKEDPANILO 
ELPAPbbDDKCKKEFDEDVYNHKTPESNIKMKIAWRYQbbPKME 
IGPVSSSRFGH YYDASKKhPyEIjl EASNWHGr r uPEKIbia 1 t»fW 
EPCSLTPGYTKLbQFlQNIIYEEGFDGSNPQKKQRNlLRIGIQN 
bGSPbWGDD] CCAENGGNSJJSLTKFLYVLRGLLRTSLSAC1 ITK 
PTHLI QNKAI I ARVTTLSDWVGLESr IGSERErNPLYKDYHGL 
I H I RQ I PR LNML I CTJESDVKDbAFKbKRKb FT I E RbtfLP P DLSD 
T V SR S S KMD1AES AKRbG PGCGMMAGG K KHLD F 


6394 


1418 


5li 


GAAAGGEGARR R PAAMATVMAATAAERAVbEEE FRWbLHDEVHA 
VbKQbQDI LKEASbRFTbPGSGTEGPAKOENFl LGSCGTDQVKG 
VbTbOGDALSOAlJVKLKMPRlWObbHFAFREDKQWKLQOIODAR 
NHVSOAlYLLTSRDOSYOFKTGAEVLKbMDAVMLDbTRARWRbT 
TPATbTLPElAASGLTRMFAPAbPSDbbVNVyiNbNKbCbTVTQ 
bHAXOPNSTKNFRPAGGAVbHSPGAMFEWGSORbEVSHVHKVEC 
VIPWbNDALVYFTVSLQbCQOLKDKISVFSSYWSYRPF 


6395 


13 


65* 


PSGRPTRPbCXlAARRGAARKGGSVSGWPAGRTPTETSNPGSSVM 
ESVTFEDVAVEFlQEWAbLDSARRSbCKYRMbDOCRTLASRGTP 
PCKPSCVSOLGQRAEPKATERGIbRATGVAWESObKPEELPSMQ 
DLbEE ASSRDMQMGPGLFbRMOLVF S I EERETPLTREDRPALQE 
PPWSLGCTGLKAAMOIORWI PVPTLGHRNPWVARDSGE 


6396 


1 


1221 


ANILSS PSKRGQ KGTL IG YS PEGTPLYNFMGDAFQHSSOS 1 PRF 
I KESbXQI bEESDSRQl FYFLCLNLLFTFVELFYGVLTNSLGU 
SDGFHMLFDCSAXV»>JGLFAAbMSRWKATRI FSYGYGR1 E1LSGF 
INGLFLI VI AFFVFMESVARb I DPPELDTHMLTPVS VGGbl VNL 
IGICAFSHAHSHAHGASOGSCHSSDHSHSHHMHGHSDHGHGHSH 
GSAGGGMN ANNRGV FLHVbADTLGS I G V 1 VST Vbl EQFGWF1AD 
PLCSLFI A 1 1_> J FLS WPL I KDACQ VbbLRLPPE YEJCELH IALEK 
IQKIEGLISYRDPHFWRHSASIVAGTIHIQVTSDVLEOR JVOOV 
TGI LKDAGVNNLT 1 QVSKEAY FQHMSGLSTGFHDV1JVMTKQME S 
MKYCKDGTYIM 


6397 


391 


122 


GAGGVGRFE AI RAPARMI E WCNDRbGKKVRVKCNTPDTI GDbK 
KLIAAQTGTRVWKIVLKKWYTIFKDHVSlJGDYElHDGMNbEbYY 
G 


6398


3S2 


1306 


HKQMGPI.l NRCKKIbLPTTVPPATMRI WLL6G1>LP FLLLLSGbC 1 
RPTEGSEVA I KI DPCFAPGS FDDQYOGCSK0VMEKLT0C-DYFTK 
Dl E AQKNY FRMWOKAHlAWbNOGKVLPONMTTTHAVAl LFYTLN 
SNVHSDFTKAMAS VARTPQQYERS FHFKYLHYYLTSAI QLLR K C 
SIMENGTLCVEVHYRTKDVHp-NAYTGATIRFGQFLSTSLLKBEA 
OEFGNQTLFTIFTCLGAPVOYFSLKKEVIjIPPYELFKVINMSYH 
PRGDWuQI.RSTGNLSTYNCQLLKASSKKCIPDPIAIASLSFLTS 
V11FSKSRV 


6399 


lb 


1245 


PNLETYFGRKCEKDSMNFTPTHTPVCRXRTWSKRGVAVSGPTK 
RRGMADSLESTPLPSPEDRLAKLHPSKELLEVYOXXMAECEAEN 
EDLLKKLELYXEACEG0HKLECDL-OOREEE1AELQKALSDM0VC 
LFOEREHVLRLYSENDRLH I RELEDKKKIQNLLALVGTDAGEVT 
YFCKEPPHKVT1LQKTI0AVGECE0SESSAFKADPK1SKRRPSR 
EPKESSEHYCRDIQTLlLOVEALOAObGEOTKLSREOlEGblED 
RR1}JLEEI0V0HQRN0NK1KELTKNLHHTQELLYESTKDKLQLR 
SENONKEKSWMLEKDNLMSK I KQYR VQCKKKEDK3GKVLPVMHE 

Pnisn\jsr, J J l\v no v v i c v c«j 1 r r v ri 


6400 


2S20 


10S3 


KTKKCDEVVYEV0SAILRJ1NCGYAKKTGKFFHNLMERKDFETWL 
DN3 SVTFLSLTDLQKNETLDJ1L1 SLSGAVQLRKLSNNLE7LLKH 
DFL KJLLPbKUJ FYLLKWLDPQTLLTCCLVSKOKNXV 1 S ACT E VVi 
OTACKNLGWQTDDSVODALHWKKVYIjKAILRMKQLEDHEAFKI-S 
Rl.irHCJlPWRI YV KfY^T.T.rTnQnni RK T .wnvQTnfyvTYfi T OT 

Z> L> 1 uJlOrt/* V I r\ Jj J Z AXAiUijV- J VJ01^4^UOAVrVJ_»*»U VO 1 Uv^- u J U i y i 

HTCAAVKFDEOKtiVTGSFDNTVACWEWSSGARTOHFRGKTlGAVr 
SVDYNDELDlbVSGSADFTVKVWALSAGTCLNTLTGHTEWVTKV 
VLOKCKVKSLLHSPGDYILLSADKYEIKIWPIGREINCKCLKTL 
SVSEDRS 1 CLQPRLHFDGKYIVCSSALGLYQWDFASYDILRVI K 
TPEI ANLALLG FGD I FAL»1>FDNRYL»Y I MDLRTESLI S RWPLPE Y 
RKSKRGSS r 1AGEASWLNGLDGHNDTGLVFATSKPDHS 1 KLVLW 
K3HG 


6401 


109 


766 


PGAAWSRPDLRGCCTGPOPAl-RMLVLPSPCPQPLAFSSVE'rKEG 
PPKRTCRSPEPGPSSSIGSPQASSPPRPNHYLLIDTOGVPYTVL 
VDEESOREPGASGAPG0KKCYSCPVCSRVFEYMSYL0RHS1THS 
EVKFFECDJCGKAFKRASHLARHHSIHLAGGGRrHGCPLCPRRF 
RDAGEIAQHSRVHSGERPFOCPHCPRRFMEQNTIiQKHTRWKHP 


6402 


1196 


279 


TTS0CGGI ROSSAI PVASMEFAAICLRNALLLLPEEOQDPKCEN 
GAiaSNQLGGNTESSESSETCSSKSHDGDKFIPAPPSSPLHKOE 
LENLKCS1 LAC S AYVALALGDNLMA LNHADKLLQQP KLSG £ L>KF 
LGHLY AAEAL1 SLDR 1 SDAITHLWPENVTDVSLGISSNEODQGS 
DKGENEAMESSGKRAPQCYPSSVNSARTVWLFNLGSAYCLRSEY 
DKAR KCLHQAAS MI K PKEVPPEAI LLAVYLE LQNGNT0LAL-Q 1 1 
KRN0LLPAVKTOSEVRKKPVTQPVHP1QPIOMPAFTTV0RK 


6403 


2 


1690 


RG I KTS VLCGNbQNOMY SHNWI MNLWNLNLTQVOQRWL I TNLQ 
RS VDDTSQA 1 QR I KND FQNLO^VFLQAKKDTDWLKEK VQS LQTL 
AANHS AlAKAi^mLEDMNSOLNSFlGOKENITTI SCANEOMLK 
DLODLHKDAENRTAI KFNOLEERFQLFETDI VNI I SN1 S YTAKK 
1.RTLTSNLN E VR TTCTDTLTKHTDDLTS LNNTLAN I R LDS VS LR 
MQODLMRSR LDTEVAKLSV I KEEMKLVDSKHGQLI KNFT 1 LQGP 
PGPRGPRGDRGSCGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG 
PAGPPGERGGKGSKGSCGPKGSRGSPGKPGPOGPSGDPGPPGPP j 
GKEGLPGPQGPPGFOGLOGrVGEPGVPGPRGLPGIiPGVPGhJPGP 
KGPPGPPGP5GAWPIAU)NEPTPAPEDNSCPPHHKNFTDKCYY j 
FSVEKEIFECAKLFCEDKSSHLVFIKTREEQQWIKKQMVGRESH 1 
W1GLTDSERFN EW KWLDGTSPDYKNV1 KAGQPDNWGHGHGPGEDC j 
AGL1 YAGQKNDFQCEDV1WF1 CEKDRETVl^SAL J 


6404 


1012 


222


AAALAMAAFAPCL I S VFSSSQEbGAAJ AQLVAQRAACCLAGARA J 
R F A IX? LS GG S L> VS K LAH EL PAA V APAG PAS 1AR W TLG FCD F.R L V 
PFrK/^£STYGbYRTHl,L>SRLPlPESOVITINPELPVEEAA£DYA 

k k : . r cafqg: )S I PV fdlli lg vg pdght cs lfpdh PLLQERE X I 

VAr J SDSPXPPPQR VTLTLPVLHAARTVIFVATGEGKAAVLKRI . 
LE 'JOE EN P L P AALVQ PHTG K LC W FLDEAAARLLT V P F E KHS PL 


6405 


1 


1456 


AA." . F R PTPRAPLGR EG TGSDS E MAASM F YGRLVA VAT LR MKR P R 
TACKAMQVijGSSGLFNNHGLOVOQQOQR^LSLHEYMSMELLOE 
AGV S V PKGY V AKS PDEAY AI AX K LGSKDW I XACVLAGG RGXGT 
TEirGL KGG V K 1 V FS F B E AKA VS S QM I G K K L FTKQTG EKG R 1 CNQ 
VLVCE R KY PR R EY Y FAI TMERS FQGPVLI GSSHGGVN I EDVAAE 
TPF.aI 3 KEP I D3 EEG I KKEQALOLAQKMGFPPNI VESAAENKVK 

Ij - x_r x.i\ i li uc . cirri vrii/w' Ljyyn vajv,i il/aax i^h ^*Ji'' j a* 

KlFDLQDWTOEDEKDKDAAKANLNYlGLDGNIGa.VNGAGIAMA 
TKD; I KLHGGT PAN F LD VGGGAT VHQVTEAFKLITSDKKVLA1 L 
VN1 -r^TMTJrnVT^OnT VMAVKni,F*7KlPVVVRI/X^TRVDnAKA 

V IX J VTV_J _I J J \w V JL X V 1 l/l V [\^_JLJi-*A. i\ -i. gr V V v I\LJ\S\J J. IXV UUnfV/l 

LI /OSGLK : I*ACDDLDEAARMW KJLSE 3 VTLAKQAH VDVKFQLP 
I 


6406 


1036 


167 


HPROMRGEDTPEAPPYSSGRYDS 1 XTEVSG CPE DLTVG RAPTAD 
DDD^'DHDDKFD^K^WSEGMDPERLKAFTJMFVRLI'^DENLDRW 

VD 1 <'vr\DVPVT OZl t 1 PQrCDACDPrAPRl.PlfB T CTYT.KCrCOMV 
Vr J r<\Jr KJc ft 1 1 Lovonyr r yCKMrvfvrC I r* A I AjIVo^-JvKr'lA, 

KNGf«;EMTR PTF PHLTS AMAEN 1 LAAACES ETRXAAKRMRLE3 YQ 
SSCDEPIALDK0HSKDSAA1THSTYSLPASSYSQDPVYANGGLN 

CRAALGSGMGRGKCRPVMERGCLTA 


6407


4 9^ 


15C 


VGLC^AVSOTVLAOl.DALLVFPGOVAOLSCTLSPOHVTJRDYGV 
SWYOORAGSAPRyLLYYRSEEDHKRPADlPDRFSAARDEAilNAC 
VL71 SPVQPEDDAJDYYCSVGYGFSP 


6408


1458 


903 


RGC I TS SQAWR LFGC- VTRGFNMR J EKCYFCSGP IYPGHGMMFVR 
NDCKVFRFCKSKCHKNFKKKRKPRKVRWTKAFRKAAGKELTVDN 
S FE r EKKRNEP 1 KYCR ELWNXT3 DAMXRVEB1 XQKRQAXF1KNK 
LKK^KEljQKVCDIKEVXONIHLlRAPLAGKGXOLEEKMVQOLOE 
DVDNEDAP 


6409 


ISC 


446 


l\T7i ! .AN U .R C FTCDR LCGGCTAPAP PAHQG 1 VLQPVMPSCDPGP 
GPArLPTKTFRSYLPRCHRTYSCVHCRAHLAXHDEIilSKSFOGS 
HGRAYLFNSV 


6410 


85 


607 


RGG '. AGCVACLGCWGQSS S PKAAF PAGS ACLPADSC PCLL F0AC 
AISGLFNC1 T3 KPLN1AAGVWM3KNAF3LLLCEAPFCCQF1EFA 
NTVAEKVDKLRSWQX/iVFYCGMAWPIVISLTLTTLLGNAlAFA 
TGV1 Y GUSALGK KGDA1 S YA-RICQORQOADEEKLASTLEGEL 


6411 


302


772 


RLSJMASSLNEDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAG3 AVLFKKKFGGVGELLNQQKKSGEVAVLXRDGRyi YYH TK 
KRASHKPTYENLOXSLEAMKSKCLKNGVTDLSMPRIGCGLDRLQ 
WEN\' SAM I EEVFEATD J K 3 TVYTL 


6412 


61 


1709 


RPVTS FS PLPGS CGGK LGTRTMLGRS LREVSAALXQGCjI TPTSL 
CQKC LSLI XXTKFLHAYITVSEEVALKQAEESEKRYKNGOSLGD 
LDG • PJAVXIWFSTSGlETTCASNMLKGYIPPyNATVVOKLLDO 
GALLMGKTNLDE FAKGSGSTDGV FG P VKNPHS YS KQYREKR KQN 
PHSF N EDSPVfL 3 TGGSSGGSAAAVS AFTCYAALGSDTGGSTRN P 
AAK C G LVGFX PS YG LVSRHG L I PLVNSHDVPG J LTR C VDDAA I V 
LGALAGPDPRD STTVHEP IN KPFMLPS LADVS KLCIG I P KE YLV 
PELSSEVOSLWSXAADLFESEGAXVIEVSLPHTSYSIVCyHVLC 
TSEVASNMARFDGLCYGHRCDI DVS TEAM YAATRREG FNDWRG 
RILrGNFFLLKElTYEbTYFVKAQKVRRLIANDFVNAFNSGVDVliL 
TPT7 LSEAVFY LEF 3 XEDNRTRS AQDDI FTQAVNMAGLPAVS I P 
VA15NOGLPIGL0F1 GRAFCIXJQLLTVAKWFEKQVQFPV J 0L0E 
LMDrCSAVLENEKLASVSLKQ 


6413 




885 


H E PR CAG f'lAAS ISA MG DL-E P Y MPEN F 1 SRAFATMGETVMSVK1 3 R 
NRLTG3 PAGYCFVEFADLATAEKCLHK1NGKPLPGATPAKRFKL 
NYArVGKOPDN!SFEYSLFVGDLTPDVDDGMLyEFFVKVYPSCRG 
GKWLDCTGVSKGYGFVKFTDELEQKRALTECCGAVGI>GSKPVR 
LSVA1PKASRVKPVEYS0MYSYSYNQYY0OY0NYYAOWGYDQNT 
GSYSYSYPQYGYTCSTMOTYEEVGDDALEDPMPCI'DVTEANKEF 
MECSEELYDALMDC1IW0PLDTVSSEIPAMM ; 


6414 


1 


538 


RGGRAALLPWRRFPCCRPKPQPARPSSRATPGPRSPG:4AT£3GV 
S FSVoDGV PF.AEKNAGEPENTYILRPVFOORFRPSWKDCIHAV 
LKEEUVN A£ YS PEEMPQLTKHLS EN 1 KD K LXEMGFDR Y l-GWVQV 
VIGEQKGEGVFM^.SRCFWDADTDNYTHDVFMNDSLFCWAAFGC 
FYY 


6415


2 


1168 


FVRQWOS S HRRACGl^GCEARAGGGEEPRGRASSVAGWVGAFRAP 
FIEAAVAG LGAGSGKRRRGWKMPVHSRGDKK ETNHHDEMEVDYA 
ENEGSSSiiDEDTF.S.SSVSEDGDSSSiMDDEDCERRRMECLDEMSN 
LEX0FTDLKDOLYKERLSQVDAKL0EV1AGKAPEYLEPLATLOE 
NMQ1RTKVAG1YRELCLESVXNXYECE10ASR0HCESEKLLLYD 
TVQSELEEKIRRLEEDRHSID1TSELWNDELQSRKKRKDPFWPD 
KKXPGWS GPY1 VYMLQDLDI LEDWTT1 R KAMATLG PKRVKTEP 
PVKLE'KHLHSARSF.EGRliYYDGEWYIRGOTl CIDKKDECPTSAV 
1TTI NHDE V WFKRF DGSKSKLY 1 SQLCXGKY S I KttE 


6416 


4 1 C 


1519 


EIAFADLE2PACAPVLLSRATSS7MSVTGGKMAPSLT0E1LSHL 
GLAS KTAA WGTLGTLRTFLNFS VDKDAORLLRA I TGOG VDRSAI 
VDVLTNRS R EQRQLI SRJ^ FQERTQQDLKKSLQAAbSGNLER I VM 
ALLQ?TA0FDAQELKTAt>XASDSAVDVA3 El UATRTPPQ^QECL 
A V Y KHN FC V EAVDG 3 TS ETSG I LQDLLLALA KGGRDS YSGIIDY 
NIAEODVOALORAEGPSREETWVPVrTCKNPEHLIRVFDOYORS 
TGCEbEEAVCNRFHGDAQVALLGLASVJXNTPLYFADKLHQALO 
ETEPN YCVL 1RI LIS RCETDLLS 1 RAEFRKXFGXSLYSS LQDAV 
KGDCQSALLALCRAEDM 


6417 


i 




"R GES R VLW S ELEGEAGGAGGWAS S LN ARMDN R FATA FV I ACVLS 
LI STI YMAAS1GTDFWYEYRSPV0EKSSDLNKSIWDEF3 SDEAD 
EKTTODALFRYNGTVGLWRRCIT3PKNMI1WYSPPSRTESFDWT 
KCVSFTL7EQFMEKFVDPGNHNSG2DLL.RTYLWRCOFLLPFVSL 
GLMCFGA10 GLCACI CRSLYPTIATG I UiLtAGLCTl jGSVSCYV 
AG 1 ELLHCKLELPDNVSGEFGWSFCUACVSAPLOFMASALFI WA 
AKTNRKEYTI.MKAYRVA 


6418 


2 


662 


TKTRPRRPPGLGAAVGKAGARSTSTPAGASF/iAAYOADPPPPAH 
TP APPP PF P CGG I ACHGEPAKFYG YDNLQRQP I FTTOOEAELVO 
YPDCKS SSG H 3 GEDPDHLNQSSS PSQMFPWMR PQAAPG RRRGRQ 
TYSRFOTL.ELEK£FLFNPYLTRKRRI EVSHALAiTERCVKl WFQ 
NRRMKWKKENNKDKFPVSRQEVKDGETKKEAOELEEDRAEGLTN 


6419 

1 


1 


973 


PGR PR VRN FDLNS KS I LQEFFCTR S I Ql PANR SXTAMS KCP I r P 
NlAR S 3 S TS G PkDKEDTGROXbl STGS 3uP ATLOG ATDS 1>GLE WH1> 
PSPDPVTVP YLS PLVVWKELESLLEMEGDHAI TVADFVDHHPI V 
FWNLVW Y FRR LDL PSNLPGL I LSS BH CN K YSK I P RH CMS EDS K Y 
VLI OMLWDNMKLHQDPGQPLY I LWNAHTOKY PMVHLLOKSDNS F 
^ELLKSWKSIKW^DVYGPMSQILETLNKCPHFXRCRSLYREI 
UFl^SLVALGRENIDIDAFDKEYKflAYDRliTPSQVKSTHNCDRPP 
STGVMECRXTFGEPYL 


6420 


207 


1187 


3KM3DKNOTCGVGQDSVFYMI CLim LEEWFGVEQLEDYLNFAN 
YLLWVFTPLI LLI LPYFT1 FLLYLTI I FLHI YKRKNVLKEAYSH 
N LWEGAR KTV ATLW DG HAAVWHG YE VHGME K I PEDGPALI 1 FYH 
GAI FIDFYYFKAK I FIHKGRTCRWADHFVFKI PGFSLLLDVFC 
ALHG PR EKCVEI LRSGHLLA I S PGGVREALI S DETYN I WiGHRR 
GFACVAI £AKVPI 1 PMFTONIREGFRSLGGTRLFRWLYEXFRYP 

PCT/US00/34263



F AFMYGGF PVKLKTYLGDP 1 P Y DPQ1 TAEELAEKTKNAVQAL I D 
KBQRI PGN 1MSALLERFH 


6421 


1844 


362 


W,M,S1^»C>PE^SNKLLSPHPHSWLRSEFKMA5£PAVLRASRL 
YOWSLKSSAOFLGSPOI.ROVGOIIRVPAKilAATblLEPAGRCCW 
DEPVR 1 AVRGLAPEOPVTLRAS LRDE KGALFC-AHARYRADTLGE 
LDLEIiAPALGGSFAGLEPMGLLWALEPEKPLVRLVKRDVRTPLA 
VELEVLDGHDPDPGRUXQTRHERYFLPPGVRREPVRVGRVRG? 
LFLPFEPGPFPGlVDMFGTGGGLbEYRASLLAGKGFAVMALAYY 
NYEDLPKTMETl.HLEYFEEAKNYLLSHFEVKGPGVGLliGlSKGG 
ELCLSMAS FLKG1 TAAW1KGSVANVGGTLRY KGETLPPVGVNR 
imi KVTKDGYAD1VDVL.NSPLEGPDQKSF] PVERAESTFLFLVG 
QDDyiH W KS E F YANEA C'KR LQ&llGRRK PQ 1 1 CYPETGHYl EPPYF 
FLORAS LHAWGS PI I WGGEPRAHAMAOVDAWKCLQTFFHKHLG 
GREGTI PSKV 


6422 




2132 


EG ENLS W FQE FWGI) I A K EFY WKTPC PGP FJLR YNFDVTKGKI FIE 
WMKGATTWI CYNVLDRMVHEKKLGDKVAFYWEGNEPGETTQITY 
HOLLVCVCQFSNVLRKQG1 HKGDRVA1 YMPMI PELWAMLACAR 
IGALHSlVFAGFSSESLCERlLDSSCfiLLlTTDAFYRGEKbWL 
KKLADEALQKCQEKGFPVRCC3 WXH1 .GRAELGMGDSTSQSPP1 
KP S C FDVC1 SWNCG I DLWWHELMQEAGDECE PE WCDAEDPLF J L 
YTSGSTGXPXGWHTVGGYMLYVATlFKYVFDFHAfDVFWCrAn 
IGWITGHSYVTYGPIiANGATSVLFEGl PTYPDVNRLWSI VDKYK 
VTKFYTAPTAIRLLMKFGDEPVTKHSRASLQVLGTVGEPINPEA 
WJ ,W YHR WGAQRCP I VDTFWQTETGGHttLTPLPGATPMKPGSAT 
FPFFGVAPAII^ESGEELEGEAEGYLVFKOPWPGIMRTVYGNHE 
RFETTYFKKFPGYYVTGDGCQRDQDGYYVnTGRTDDMLNVSGHL 
LSTAEVESALVEHEAVAEAAWGHPHPVKGECLYCFVTLCDGHT 
FSPKLTEELKXQIREK1GP3ATPDY10NAPGLPKTRSGKIMRRV 
LRKZAONDrtDLGDMSTVADPSVlSHLFSHRCLTIQ 


6423 


614 


1237 


ANLXE2 PRDLPPETVLLYLVSNQ I TS J PNE J FKJDLHQLRVLNLS 

FJxkjlr.f 1 U&rLHt KLrV/ic 1 Lj\J± UUJU^lsrvHXVJ vni\r*r\r WI>lI>A-/VCft 

R 1 AKNPWHCDCTLOQVLRSMASNHETAHNVJ CKTSVLDEHAGR P 
FLNAANDADLCNL PKKTTD YAMLVTM KGWFTMVI S YWYY VRQN 
QEDARRHLEYLKSLPSRQKKADEPDDISTVV 


6424


1 


1196 


KKVSWpVAAMVKCSCVLFRKYGNFIDKLRLrrRGGSGGMGYPRL. 
GGEGGKGGDVWWAHNRMTLKOI'KDR Y PRKR FVAGVGANS K I S A 
LKGSKGKDVJE1 PVPVGISVTDENGKI T GELNKENDR I LVAQGGL. 
GCKLLTNF"LPLKG0KR13HLDLKLIADVGLVGFPNAGKSSLLSC 
VSKAXPA I AD YAFTTLKPELGK IMYSDFKQJ S VADLPGLI EGAH 
MNKGMGHKFLKHI SRTROLLFWDI SGFQLSSUTQYRTAFETI I 
LLTKELELYK£ELQTKPAi,LAVNKMDLPDACDKFHELMS0LONP 
KDFLH LFEKNM2 PERT VE FQH 1 1 P I S A VTGEG J E ELKNCI RXSL 
DEC ANQEMDAlrH KKQLLNu W I S DTMS S TE P PS KHAVTTS KM D 1 1 


6425


1850 


1144 


LAKEGGGG 1 PI,ETLKEESOS RHVLPAS F EVNSLQKSNWGFLLTG 
LVGGTLVA\nfAVATPFVTPALRKVCLPFVPATMKQlENVVKMLR 
CRRGSLVD3GSGDGR1VIAAAKKGFTAVGYELNPWLVWYSRYRA 
WREGVHGSAKFYI SDLWKVTFSQYSNVVI FGVPOMMLQLBKKLE 
RELEDDAR VIACRFPFPHWTPDHVTGEGIDTVWAYDASTFRGRE 
KR PCTSMH FQI'P I QA 


6426 


30 


565 


£ RG AAVGG MS V AGG E 1 RGDTGGEDTAA P G R FS FS PE P TLED J R R 
LHAE FAA ERDWEOFHQPRN LLLALVGEVGELAELFQVi KTDGEPG 
PQGWS PR E RAAiQEELSDVLI YLVALAARCR VDLPLAVLS KMDI 
WRRR YPAHLARSSSRKYTELPHGAI SED0AVGPAD3 PCDSTGOT 

ST 


6427 


14S 


955 


AASWG PPH VPKAGKMVSWM 1 CRL WL VFGMLCPAYAS YKAVKTK 
NI R2 YVR V7M W WI VFALFMAAEI VTD1 F I S W FP FY YE I KMAFVL 
WLLf'PYTKGASLLYRkFVHFSLSRHtKElDAyiVQAKERSYETV 
LS FC K KGLN I AAS AAVQAATK SOG ALAGR LP S FSMCDLRS I SDA 
FAP;,yKi;pi,YX,KD0VSHRRPF2GYRAGGL0DSDTECECWSDTEA 

vpiu.p/'j?prekpbirsoslrvvkrkppvregtsrslkvrtrkkt 
vpsi^vd^ 


6428 


1982 


4 4': 


SGS0GK v jEDHQKVPID1QTSKL LDWLVDRKKCSLKWQSLVL>TlR 
EK 1 KA Al ODMP ESEE1 AQLL S OS Y I H Y FHCLR I LDLLKGTEAST 
KNl FGRYSSQRMKDWQEI IAl.YEKDNTYLVELSSl/uVRNVNYEl 
FSliKKCI AKCOQLOOKYSRKEEECQAGAAEWRE0FYHSCKQYGI 
TGEr: V^GELLALVKDLPSQLAE J GAAAOOS LGEAI DVYQASVGF 
VCE£PTEQVLPMLRFVQKRGXSTVYEWRTGTEPSWERPHLEEL 
PEQVA EDA 3 DWGDFGVEAVSEGTDSG I SAEAAG I DWG I FPESDS 
KDPC G DO 1 DWGDDAVALQI TVLEAGTQAF EG VARGPDALTLLEY 
TETKMOKLDELMELE1FLA0RAVELSEEADVLSVS0E0LAPA1L 
CGOTKEKMVTfWSVLEDiaGKLTSLOLOHLFM^LASPRY-VDRVT 
EFLCOKI.KOSOLLALKKEl^MVUKOOEALEFOAALEPKLPLLLEK 
TKELQKL1BAD1SKRYSGRPVNLMGTSL 


6429


3413 


344^ 


EPS S V7TAA P RG P LAAH P LEJUvVg E DDR RAI ,$FDSRI KV F ANGTL 
W KSV TOKDAGDYLCVARN KV GHDYW LKVD WM KP AKI EHKEE 
NDHK^FyGGDLKVDCVATGLPNPEISWSLFDGSLVNSFMQSDDS 
GGRTK R y W FNNGTLYFNE VG MR E EG D YTC FAENOVGKDEMR VR 
VKVv TA PAT I RN KTCLAVOV ? Y GD WT VACE AKG E PMPKVTWLS 
PWKVIFTS£EKYOIY0DGTL.I,IOKA0RSD.SGNyTCLVRNSAGE 
DRK'J'VW j ilVKVQPPKJNRNPNPITTVREI AAGGSRKLIDCKAEG 
I PTFK VI /.\'AFPEGWLFAPYYGNR I TVHGNGSLD I RSLRXSDSV 
OLVO'ARNEGGEARLIVOLTVLEPMEKPIFHDPISEXITAWAGH 
TISUJC^AAGTPTPSLNAfsnjPNGTDLOSGQOLORFYHKAIXJMLH 
I SGLf S V DAG AY RC V ARKAAGHTERLV SLKVGLK P E AN KQYHN L 
VSI 1 NGF.TLKLPCTPPGAGCGRFSWTLPNGMKLEGPQTLGRVSL 
LDNGTLTVREASVFDRGTYVCRttETEYGPSVTSl PVIVIAYPPR 
I TS E P T p VI YTR PGNTVKLNCMAMG 1 P KAI> I THE LP OKS H h KAG 
VQAKLl t.iNRFLH PUbbL 1 1 UHA J UKJJAL>r iKL oiakjm jl i_Ajj>U£>Jt J 
TYJHVF 


6430 


1946

1 


602 


rtrvi tglrrtllwseavgas s trgdtg i pgy geggagpgggeg 
amle/.^iaepspbdppptlkpe'j'oppekrrrti edfnxfcsfvla 
yagy: ff ske:esdwpasgsss ? lrgesaa^segwdsapsplrti 

yi r v a r*>;/\jD jvr\ xw^^/tur i v j v_j f r\ o j r^prv x-v* 1 *^ x xjlju *u i 

xlkdhlfdldgpkvasplspt^ lthtsrppaaltpv plsqgdls 
hpprkkjdrxrjrkxgpgagagkgvlrrprptfgdgekrsrjkksk 

KRKL,KKAERGDRXJ>PPGPPQAP?SDTDSEEEEEEEEEEEEKEMA 

twggeapvpvlptpperprppatvhpegvppadseskevgste 
tsqdgdasssegkmrvkdedimvesgddswdlitcycrkffagr 
pmiecslcgrwihlscakikkt.wpdffycokckelrpearrlg 

GPPK5GEP 


6431 


3 


605 


WWNS £ VN LPA YAPYLPCEACAKQDGR KGGA Y AG KM EATTAGVGR 
LEEEIALRRKERLKALREKTGRKDKEDGEPKT3CHLREEEEEGEKH 
RELRLRN YVPEDEDLKKRRVPOAK PVAVEEK VKEOLEAAKPEPV 
IEEVDLANLAPRKPDWDLKRDVAXKLEKLKKRTQRAIAELIRER 
LKGQEDS luASAVDAATEOXTCDSD 


6432 


56 


1692 


GGLGTMGSRIKONPETTFEVYVEVAYPRTGGTLSDPEVOROFPE 
DYSD0EVL0TLTKFCFPFYVD5LTVS0VG0NFTFVLTDIDSKQR 
FGFCRLSSGAKSCFCILSYLPKFEVFYKLLWILADYTTKRQENO 
WNELLFTI.HJOjPIPDPGVSWLQWSYFTVPDTRELPSIPENRII 

ltey f v a vdvnnklh l y askly e rri li 1 cs xlstltac i hgs a 
amly pm% 7 qhvyi pvlp phlld y ccapmp yli g ihls lmekvrn 
malddwilitvdtntletpfddloslfndvisslknrlkkvstt 
TG DG VAKAF L-KAQAA FFGS YRA'ALK I EPEB P J T FCEEA F V SH YR 
SGAMR0FL0NATQLCLFK0F1DGRLDLLNSGEGFSDVFEEE1NM 
GEYAGSDKLYHOW'LSTWKGSGAILNTVKTKANPAMKTVYKFEI 
AENGCAPTPEEOL.PKTAPSPLVEAKDPKLKEDRRP1TVHFG0VK 
PPRFHVVKRPKSMIAVEGRRTSVPSPEQNTIATPATLHILQKS1 
TKFAAKFPTRCtiTSSSH 


6433


1S24 


484 


APVTKRKEVFAK^SKGSALPAGRDPKRPALPETLCESGWASNTA 

PTTPPOPGWCLCGKDFKSSCQTPGREKERRLATMHGSCSFLMLL 

LFLLLLL.VATTGPVGALTDEEKRLMVELHNIiYRAQVSPTA£DML 

HKRWDEELAAFAKAYAKOCVWGHNKERGRRGENLFAITDEGMDV 

PLAMEEWHHEPEHYNLSAATCSPGQMCGHYTOVVWAKTERIGCG 1 

SHFCEKLOGVErTNIELLVCNYEPPGNVKGKRPYQEGTPCSQCP 

SGYHCKNSLCEPIGSPEDAQDLPYLVTEAPSFRATEASDSRKMG 

AEGPDKPSW-SGLNSGPGHVWGPLLGLLLLPPLVLAGIF 


6434




2002 


MPOLNFGMADPTOMGGLSMLLLAGEHALGTPEVFSGTCRPDVSE < 
SPELR QK SPL FQF AE J S S S TSHSDASTKQCQTS Al.FQFAE I S SN ' 
TSOLGGAEPVKRCGKSAjbKOI^AEMCLASEGWW-lEESKLlKAKES 
DGGR 1 KE LKKG KE E KEI KMEKTDETRLOKEAEFEKSAKENLRDS 
KELRNFEADQ J DDIMAI KMEPPKEIRKEELEEDHKC5HFPDFSY 
SASSKIIISDVPSRKDHMCHPHGIMIIEDPAALNKPEKLKKKKK 
KSKMDRHGNDKSrPKKTCKKRQSSESDIESVI YTIEAVAKGDWG 
I EKI .GDT PRK K VRTS SSG KGS I LDAKP PKKKVKSREKKKS K EKS 
SDTTKESRPPDr; S I SAS KNI SGETPEGI KAEPLTPMEDA1>PPS 
IjSGCAKPEDSDCKRKIETCGSRKSERSCKGALY KTLVSEGMLTS 
LRANVLRGKRSSGKGNSSDHEGCWNEESWTFSOSGTSGSKKFKK j 
TKPKEDCliLGSAKLDEEFEKKFUSLPQYSPVTFDRKCVPVPRKX 
KKTGNVSSEPTKTSKGSGDKWSNK0LFLDAIHPTEA1FSEDRNT 
ME PVU KVKN 1 PS 3 FN TPEPTTTARTFGGOP KEKSKENPDY S PCQ 
DTORAGYFIHEEVLWMTNl^^CGGVYLKOLRHTAMTNA 


6435


2227 


657 


ALQRDAAAAYAH PE Y EER FL0EE7VSQQ I NS I ELLQTR PLALPE 
WKSCRP^QRPVHLRGRPASOPTVIRGITYYKAKVSEEENDIEE 
0ODEFFSGDNGVDLLIEDOLLRHNGLMTSVTRRPAATRQGHSTA 
VTSDLNARTAPWrSALPOPSTSDPSIAmiASVGFTLQTTSVSPD 
PTH ESVLQ P S PO V P ATTVAKTATQQ PAAPAP PAVE PR EALttEAM 
HTVP V P P TT VR TBS 1 ,G KD AP AGRGTT P AS PTLS F EEE DD I RNV t 
GRCKDTLST1TGPTTQKTYGRNEGAWMKDPLAKDERI YVTNYYY 
GiaTbV EFRNLEN )• KQGRWSNS YKLPY S W 1 GTGHWYNGAF YYl^R 
AFTRN1 1 K YDLKOK Y VAAWAMLHDVAYE EATPWRWOGHS DVD FA 
VDENGLWLI Y PA1 DD EGFSOE VI VLSKLNAADLSTOK ETTWR TG 
LRRNFYGNCFVJCGVLYAVDSYN0RNAN1STAFDTHTNT0JVPR 
LLFENEYFYTTQI DYNPKDRLLYAWDNGHQVTYHV1 FAY 


6436




341 


GACRPPVRQDPDSG PDYEALPAGATVTTHMVAGAVAG 1 LEHC VM 
YPIDCVKTRMOSLQPDPAARYRNVLSALWRIIRTEGLWRPMRGL 
NVTATG AG PAHA LY FACYEKLKKTLSDV1 HPGGNSH1 ANGAAGC 
VATLLHDAAMN PAF. W XQRMQM YN S P YHR VTDC VRAVWQN EGAG 
AFYRSYTTOLTKJ>IVPF0AlHFMTYEFL0EHFNPORRYNPSSHVL 
SGACAGAVAAAATTPLDVCKTLLNTQE^J.ALNSHITGHITGMAS 
AFRTVYOVGGVTAYFRGVQARV3 Y0I PSTAIAWSVYEFFKYLIT 
KRQEEWRAGK 


6437 


1826 


360 


PPAPAPPASPARHVTRTARGHLBGGSRAPPLLQAVFLQ I XNMVK 
L3 HTLADHGDDVWCCAFS FS LliATCSLDKTl RLYS LRDFTELPH 
SPLKFHTYAVliCCCFSPSGHIIASCSTDGTTVLWNTENGOMlAV 
MEOPSGSP\mVCQFSPDSTCU^GAAJX5TVVLVmAOSYKLYRCG 
SVKDGSIJ^ACAFSPNGSFFVTGSSCGDLT/VTODKMRCIjKSEKAH 
DLGITCCDFSS0PVSDGEOGLOFFRLASCG0DCOVKIW2VSFTH 
1 LG FELK YKSTt S GHCAP VLACAFS HDGOMLVS GS VDKS V I VY D 
TNTEIQ I LHTLTOHTK Y VTTCAFAPK1XL1.ATGSMDKTVN 1 W0FD 
LETLCOARSTeHOIKOFTEDWKEEDVSTWLCAODLKDLVGIFKM 
NNI DGKELLNLTKES LADDLK1 ESI.GLRSKVLRKIEELRTKVKS 
LSSGI PDEF ICPlTr! ELMXDPV1 AS DGYSYEKEAMENWDPAKKN 
R?SPP 


6438


10S 




EVQ1 LRAKMFQTGGM VFYGLLAQTfMQFGGLPVPLDQTLPLNn/ 
NPALPLSPTGLAGS:,TNALSNGLLSCGLLG3 LENLPLLD 1 LKPG 
GGTSGGLLGGLLGKYTSVIPGLNN J J Dl KVTDPOLLELGLVQ5P 
CGHR liYVTl PLG I Kl QVNTPLVGAS I *LRLAVKI»DITAEI LA VRD 
KQERIHLVLGDCTH?PGSLQISLLDGLGPLPIQG1>UDSI>TGILN j 
KVLPELVOGNVCPLVNEVLRGLDl TLVHD2 VNMLIHGLQFV 1 KV 


6439


23 


4:; 


S I QTASAI TTEMAS CSQG I QQLLQA € KRAAEK VADARKR KARR L | 
KQAKEEAQKEVEOYRREREHEFQSKOQAAMGSOGNLSAEVEOAT 
KRQVQG>4QS S0ORNRERV1A0LLGKV CDVRPOVHPNYK J SA 


6440 


3 


si-, 


RARWNSDMGDLPGLVKLSIALRIOPKDGPVFYKVnGORFGC>NRT 
I KLLTGSSYKVEVKI KPSTLOVEN J S 3 GGVLVPLELKSKEPDGD 
RWYTGTYDTEG VTPTKSGEROP 1Q I TMPFTD I G TFBTVWQ V K F 
YNYHKRDHCOMGS?rsVIEYECKPNETRSLMWVNKESFL 


6441 


234 


137i 


KSGGLRRRQRPGRSAWGEEELPPGMKKFKAAMLLGSVGDALGY 
RNVCKENSTVGMKICEELORSGGLDI'LVLSPGEWPVSDNTIMHJ 
ATAEAliTTDYVCLD^LYREMVRCYVEI VF.KLPERRPDPAT1EGC 
AOLKPNNYLIiAWH'1'VKNEKGSGFGAATXAMCIGIjRYWKPERLET 
blEVSVECGRMTHNHPTGFLGSLCTALrVSFAAOGKPLVQWGRD 
MI.RAVPLAEEYCRKTI RKTAEYQEHWKYFEAKWQFYLEERKI SK 
DSENKAIFPDKYDAElREKTYRKWS?EGRGGRRGHDAPM1AYDA 
LLAAGNS WTELCH RA.MFKGGESAATGT1 AGCLFG LLYGLDLV PK 
GLYQDLEDKEXLEDLGAALYRLSTEEK 


6442 


34 


7SK 


AED P AGGLAGQD T MFARG L KR KCVG H E E D V EG ALAG lj KTVS S Y S 
L.OR0SLL.DNJS LV K LOLCH MLVE PNLCR SVLI ANTVROI OEEMTQ 
DGTWRTVAPQAAERAPLDRLVSTE1 J,CRAAWGQEGAHPA£GLGD 
GHrQGPVSDLCPVTSAOAPRHLOSSAWEMDGPRENRGSFHKSLD 
QIFETLETKNPSCMECLFSDVDSPYYDl.DTVLTGMMGGARPGPC 
EGLEGIJVPATPGPSSSCKSDLGELDKVVEILVET 


6443 


2 


551 


KASPAAS S VR F F R P K aEPQTLV I PKN;iAEEOKLKLER LMKN FDK 
AVPIPEKMSEWAPRPFPEF\ r RDVMG5SAGAGSGEFHVYRHLRRR 
EYQRODYMDAMAEKOKbDAEFQKR^EKNKIAAEEQTAKRRKKRQ 
KI.KKKKLLAKKKKLEOKKOEGPGOPKEOGSSSSAEASGTEEEEE 
VPSFTMGR 


6444 


390 




6STPRGKMRAP1PEPKPGDLI £1 FRPFYRHWAIYVGDGYWHLA 
P PS EVAGAGAAS VMS ALTDKA 1 VK KE LLYDVAGS DXY QVNN KHD 
DKYSPliPCSKIlC/RArEbVGOEVLYKLTSENCEHFVNELKYGVA 
RS DQVRDVI IAAS VAGMGLAAM S LI G VMFSRH KRQKQ 


6445 


2 


753 


AGAAGAAGAARS PRPGAHTKGVRGLP £ RRRSPDCGRMELAAGS F 
S E EOF WE ACAELQQ P A LAG AD WQLLV ETSG I S 1 Y RLLDKXTGLY 
E Y KVFGVLEDCS P Th LA*) I YMDS DY R K QWDQYVKE LYEQE CNGE 
TVVYWEVKYPFPMSNRDYVYLRQRRDLDMEGRKIHVILARSTSM 
POIjGERSGVIRVKOY KOSLAI ESDGK KGSKVFMYYFDNPGG03 P 
S WL.INWAAKNGV FNFL KDMARACON YLKKT 


6446 


1 


1652 


RCPTRSPPPDTPGSRG TTAMCS LASG ATGGRGA VENE E DLP ELS 
DSGDEAAWEDEDDADLPHGKOOTPCLFCNRLFTSAEBTFSHCKS 
EHQFN I DSMVHKHGLE FYGY I KL I NF3 RLKNPTVEYMNS 1 YNPV 
PWEKEEYLKPVLEODLLLQFDVEDLYEPVSVPFSYPNGLSENTS 
WE KLKHMEARALSAEAALARAREDLOKMKQFAQDFVMKTDVRT 
CSS STSV1 ADLOEDEIjGVYFSS YGH YG 1 KEEMLKDXJ RTE SYRD 
FI YQNPHI FKDKWLDVGCGTGI LSMFAAKAGAKKVLGVDOSEI 



WO 01/53312



PCT/US00/34263









LYCAMDI I R LNKLEDT2 TL ) KGXIEEVHLPVEKVDV1 ISEWMGY 

Ct 1 CCCM1 riO\l! V*Vfc1L r "V'T 7. V^/^CMTV TiT\ T r"f C T \r 7i 1 TKt \/ 1 T > 

r bhr ti>rll>UoVL l AKJVK.1 l^K^GbVxPulCrjibbVAVSDVN KHA 
DR T AFWDDVYC F KMS CMKKA VI PEAWE VLD PKTL I SEPOG 1 KH 
1 DCHTTS iSDLeFSSDFTLKITRTSMCTAI AGYFDI YFEKNCHN 
RWFSTG PQSTKTHWKQ7 VFLbEKPFSVKAGEAbKG KVTVH KNK 
KDPRSbTVTLTLNNSTCTYGLQ 


6447 


15S<i 


1068 


R LGP AEV2 HLSGPCHATLGAAN RGRAbGVRAA W RGA P LCQRVMMP 
SRTN'LATGIPSSKVXYSRLSSTDDGYIDbOFXKTPPKIPYKMA 
I lATVbFb 1 GAFL1 I iGSLLLSGYlSKGGABRAVPVLl 1GILVFL 
PGr YHLR3 AYYASKGYRGYSYDD1 PDFDD 


6448 


74


559


GQVbSHCYHYRSSRVWRGGLSRGRGAGVMAbVPYEET7EFGLC;K 
FHKPlATFSFANHT3QIR0DWRHliGVAAWWD.V\IVLSTYLEMG 
A VEbRCR SAVELGAGTGbVG 1 VAALLACR 1 R Y ERDNNFbAMbER 
OFIVRKVHYDPEKDVHIYEAOKKNOKEDL 


6449 


597 


1876 


EYGVCENbRKLtlTGVSCRDVYAKLLHRYRHILGLWQPDIGPYG 
GLLNVVVDGbFI I GMMYLPPHDPKVDDPMRFKPLFR IHbMERKA 
ATVECMYGHKGFHKGHIQIVKKDEFSTKCNOTDHJfRMSGGROEE 
FR TWLR E EWG RTLEDI FH EHMQEL I LMKF 1 YTSQYDNCLT Y RR 1 
YbPPSRPDDbl KPGLFKGTYGSHGLEIVMLSFKGRRARGTK1TG 
DPN1PAGQQTVEIDLRHRI0LPDLENQRKFNELSRIVLEVRERV 
RQE<X?EGGHEAGEGRGRQGPKESQPKPAQPRAEAPSKGPDGTPG 
ErcGF.FGDAVAAAEQPAOCGOGQPFVLPVGVSSRNEDYPRTCRM 
CFYGTGLI AGHG FTSPERTPGVFI LFDEDRFG FVWLELKS FSLY 
SRVOATFRNADAPSPOAFDEWLKNIQSLTS 


6450 


B4t 


26 9 


FVPAPRTVSGKRSL,FGEWEEHGEGEQRTGREFSGNGGRAVEAAR 
KR LLCGLWbW bS LbKVbQAOT PTPLPLP PPMC S FOGNQF^G EWF 
VLGLAGNSFRPEHRALLNAFTATFEbSDDGRFEVWNAMTRGOHC 
DTWSYVbl PAAOPGQFTVDH R VWTHEQAGR PQDQPAGQEbV AAS 
RDAGPVHLPGQSSGPLG 


6451 


232 


939 


HSPTPPTSPRASTMEDVKLEFPSLPOCKEDAEEWTYPMRREKOE 
ILPGLFLGPYSSAMKSKLPVL0KHGITH1IC1 RQNIEANF2 KPN 
FWLFRYLVLDI ADNPVENI I RFFPMTKEFlDC-SbQMGGKVLVH 
GNAC I SRSAAFVI A YIMETFGKK YRDAFAYVOE'RR FC J NPNAGF 
VHOLOEYEAiyLAKLTIOMMSPLOIERSLSVHSGTTGSLKRl'HE 
EE DDFGTMOVATAONG 


6452 


1 


652 


RTRGESSNMEPbAA YPLKCSGPRAKVFAVLLS j VLCTVTbFbbC; 
LK FLK PK1 MSFYAFE VKDAXGRTVSLEK YKGK V£L WNVASDCQ 
t TncMVi^f i/c*t v v ttcv*!) outre \rr aPDni^prrcrDbDcifPVPC 

FARKNYGVTFP2FHKIKILGSEGEPAFRFLVD5SKKEPRWNFWK 
YLVUPEGOWKFWRPEEPI EVI RPDIAALVROV 1 1 KKKEDL 


6453 


827 


223 


HRRWLPGLSMS PRRT LPRPLS bCbSbCUTbCLAAALGSAQSGS C 
RDKFvNCKWFSOOEbRKRLTPbQYHVTOEKGTESAFEGEYTHHK 
DPG3YKCVVCGTFLFKSETKFDSGSGWPSFHDV1NSEAITFTDD 
FSYGKHRVETSCSQCGAHbGH i FDDGPRPTGKRYCINSAALSFT 
PADS SG TAJ2GGSG VAS P AQAD KAEb 


6454 


827 


223 


H R R WLPGbS MS F RRTbPR PbS bC bSbCLCbC LAAA bG S AQSGS C 
RDKKUCKWFSOQEbRKRLTPbOYHVTQEKGTESAFEGEYTHHK 
UPGlYKCWCGTPliFXSETKFDSGSGWPSFHDVlKSEAITFTDD 
FSYGMHRVETSCSOCGAHLGH I FDDGPRPTGKRYClNSAAliSFT 
PADSSGTAEGGSGVASPACADKAEL 


6455 


104 2 


173 


R VHbATVS AS AAWDAbGLP VRS HMQGS TRRMG VMTDVH RRFbQb 
LMTHGVbEEWDVK KbQTHCYK VHDRNATVDKLEDF J NNI NS VbE 
SbY I El KRGVTEDDGRPI YAbVNbATTSISKMATDFAENEbDbF 
R KAJUEL J 1 DS ETG FAS STN I LKL VDQbKGKKMR K K EAEQVbQKF 
VONKWLIEKEGEFTLHGRAIbEMEQYIRETYPDAVKlCNICHSb 


LlOGCSCETCGIRKHLPCVAKYFOSNAEPRCPHCNDyWPKEIPK 
VFDPEKENESGVLKSNKXSLRSRQH 


6456 


2 


555 


RPQS3S I SMWRNSLLQVSSGLRWLRVCAMVD1 LGERKLVTCKGA 
TVEAErJ\LONKVVALyFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSS0EMLDFMRELHGAWLA1.PFHDPYRKELR 
KRYNVTA1 PKLVI VKONGEVITNKGRKQJRERGLACFODWVEAA 
DIFONFSV 


6457 


23 


892 


PTTGFPVTMFPWNWPDGKPPIMJLYVSKLNKIiHFFDFDKKIPV 

VI PDT dt f wrUU7Cf!T OCTCVT.CT DMPTVT tV'PTToi ""1 T T FT 

11LGK0YSLNIILSVFAI1LGAF1AAGSDLAFNLEGY3 FVFLND 
IFTAANGVYTK0KKDPKELGKYGVLFYNACFM1 1 PTLI I5VSTG 
Dl^OATEFNQWKNVVFIbQFlibSCFLGFLLMYSTVbCSYYNSAL 
TTAWGA I KNVS VAY1 GT LIGGDYI FSLLNFVGLN2 CMAGGL.R Y 
SFLTLSSOLKPKPVGEENICLDLKS 


6458 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNK3 I HFPDFDKXI PV 
KIjFPI^PLIjWGNHISGLSSTSKI^LPMFTVLRKFTIFLT 
IILGKQYSLN3 ILSVFAI ILGAFIAAGSDLAFNLEGY2 FVFLND 
IFTAANGVYTK0KMDPKELGKYGVLFYNACFM3 1 PTI J 1 SVSTG 
DLOOATEFNQWKNWFI LQFLLSCFLGFLLMYS TVLCS YYNSAL 
TTAWGAI KNVSVAY1GI LIGGDYJ FSLIjNFVG LN1 CKAGGLR Y 
SFLTLSSCLKPKPVGEENICLDLKS 


6459


23 


892 


P TTG F PVTNF P WM W PDG K P P I Ml L YVS KLNK 3 ■ HFPDFDKKIPV 
KLFPLPLLWGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
I ILGKQYSLNI ILSVFAI ILGAFIAAGSDLAFNLEGY3 FVFLND 
IFTAANGVYTKOKMDPKELGKYGVLFYNACFMT I PTL3 3 SVSTG 
DliOCATEFNOWKNWFILQFIiLSCFLGFLLKYSTVLCSYYNSAL 
TTAWGAI KNVSVAYIGI LI GGDYI FSLLNFVGLNI CMAGGLRY 
SFLTLSSOLKPKPVGEENICLDLKS 


6460 


23 


892 


PTTGFPVTNFPWWWPDGKPPIMILYVSKLNKl } HFPDFDKXI PV 

KLFPI.PLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 

IILGKOYSLNI 1LSVFAI ILGAFIAAGSDLAFKLEGYJ FVFLND 

I FTAAJ4GV7TKQKMDPKELGKYGVLFYNACFM j I PTLI 3 SVSTG 

DT .nftATF rNOUKH\A/F] I »OFLL.^CF1jRFI .1 MY c ^Lr^WNSAl. 
uuyyn i l, i iivtnjjvi* v vr iuyr ui-jov-. t.ljv7 r jjut u v. * yu\_j»j i iioaui 

TTAWGA I KNVS VAY1 G I L I GGDY I FS LLNFVG LN I CMAGGLRY 
SFLTLSSOLKPKPVC-EENICLDLKS 


6461 


1653 


360 


LQQRTLKITAVGOTHPIAWMAWEPSLGAFYGPASFITFVNCMYF 
i^IFIOLKRHPERKYELKEPTEECQRLAANENGEINHQDSMSLS 
LISTSALENEHTFHSQLLGASLTLLLYVALWhjFGALAVSLYYPL 
DLVFS FVFGATSLSFSAFFVVHHCVNREDWLAVJIMTCCPGRSS 
YSV0VWVOPPKSNGT>5GEAPKCPNSSAESSCTl^KSASSFKNSSQ 
GCKLTN LQAAAAQ CHANS LP LNS TPQLDNS LTE H SMDND 3 KMHV 
APLEVO FRTNVHS S RHHKNRS KGRRAS RLTVLKE YAYDVPTS VE 
GSVQNGLPKSRLGNNEGHSRSRRAYLAYREROYNPPQODSSDAC 
STLPICS S RNFE KPVSTTS KKDALRKPAWE LENQQKS YGLNLAI 
QNGP1 KSNGOEGPLLGTDSTGfT/RTGLWKHETTV 


6462 


3 


773 


SEELDREKKLKEDS PRKTPNKESGVPSLPVSLTS I KESPKEAXH 
PDSOSNEESKLXNDDRKTPVNWKDSRGTRVAVSSPMS0KOSYI0 
YLHAY PY PQMYDPS HPAYRAVS PVLWHS YFGAY LS PG FHY PVYG 
KMSGREETEKVT5TSPSWTKTTTESKALDLLOt>KANQYRSKSPA 
PVEKATAER ER EAER2RDRHS PFGQRHLHTHHHTHVGMGy PLI P 
GQYTJP FOGLTS AALVASQOVAAQASASGMFPGCR RE 


6463 


2 


350 


VILCI LGGW3 FKNADRSMEKKKGEPRTRAEARP V.'VDEDLKDSSD 
LHQAEEDADEWQES E ENVEH I P FSHNHYPEKEMVKRS QEFYBLL 
NKSRS VR F I SNEQV PMEV I DNV I RTAGL 


6464 


12


1154


GILROKEREERNRIHKKEILFLEHLLWPSEMSSLSGKVQTVLG 








LVE PS Kl^KTLTJJEKLAMTFDCCYCPP PPCQEAJS KEPI VMKNJL 
YW 1 QKNAYSHKENLQLNQETEA1 KEELl.Y FKANGGGALVENTTT 
GI SRDTQTLKRLAEE7GVHI 1 SGAGFYVDATHSSETRAMSVEQl* 
TDVLMNEILIIGAJ^TSIKOGIIGEIGCSWPLTESESKVLOATAH 
AQAQLGCPVI I HFGRSSRAPFQ1 IRI LOEAGADJ S KTVMSHl/DR 
1 1 LDKKEIiLEFAOLGCYLEYDLvGTELLhjQLGPDlDMPDDNKR 
1 R R VxLIA' EEG CEDR I LVAHDI HTKTR LMK YGGHGYSH I LTNW 
PKMLLRG1TENVLDKILIENPKQWLTFK 


6465 


126 


1396 


KKTVFFKTLRNHWKKTTAGLCLL'rWGGHWLYGKHCCWLLRRAAC 
OEAOVFGNOMPPNAQVKKATVFLNPA^.CKGKASTLFEKNAAPl 
LHLSGMDVTI VKTDYEGQAKKLLELMENJTDV1 1 VAGGDGTLQEV 
VTGVLPPTDEATFSKIPIGFIPLGETfSLSHTLFAESGNKVQHI 
TOATLA I VKG ETVPLD VLQ1 KGEKEQP V FAMTG LRWGS FRDAG V 
KVS KYWYLEPLKI KAAHFFSTLKEWPOT.HOASISYTGPTERPPN 
E PE ETPVQRPS LYRR I LRRLAS YWAQPODALSOE VS PE VWKJDVO 
l.STI ELS 1 TTRNNOLDPTS KEDFLNI CJ t PDT1 SKGDF1 TIGSR 
KVR^PKLKVEGTECLOASQCTLLIPEGAGGSFSIDSEEYEAMPV 
EVXLLPRKLQFFCDPRXREQMLTSPTC 


6466


1134 


828 


VAR GTELSQLEKAHP PADMGRRK SKR KP P F KJtKMTGTLETQFTC 
FFCNHEKSCDVKMDRARNTGV1SCTVCLEFFQTPITYLSEPVDV 
YS DVJ I DACEAANQ 


6467 


301 


2571 


GELRVLAlAHGELACHAVLTASLLSLRiRLMDSDMDYERPffVET 
IKCVWGDNAVGKTRLICARACNATLT^Y0L1*ATHVPTVWAID0 
YR VCQEVLER S RDWDDVSVSLRLWDTF GDHHKDRR FAYGRSDV 
WLCFS I ANPNSLHHVXTMWYPE I KHFCPRAPV I LVGCQLDLR y 
ADLEAVNRARPPLARFIKPNEILPPEKGREVAKELGIPYYETSV 
VAOFGIKIJVFDMAIRAALISRR^LQFWKSHLRNVORPLLOAPFL 
PPKPPPPllWPDPPSSSEECPAHLLEDrLCADVILVLOERVRl 
FAHK1Y LSTS S SKFYDLFbMDLSEGELGC- PSEPGGTHPEDHQGH 
S EQH H HH H31H \ \ HG R DF LLRAAS F DVC ES VDEAGGS G PAGLRAST 
SDGI LRGNGTG YLPGRGRVLSSWSRAFV S 1 QEEMAEDPLTYKSR 
LMWVKMDSS 1 QPGPFRAVLKYLYTGELDENERDLMHIAHIAEL 
LEVFPLR MKVANI LNNEAFtWQEITKAF «VRRTNRVKECLAKGT 
FSDVTFILDDGTISAHKPLLISSCDWMA/vMFGGPFVESSTREW 
F P YTSKS CMRAVLEYLYTGMFTSSPDLDDMKL.I I LANRLCLPHL 
VALTEQYTVTG LMEATQMMVDI DGDVLV FLELAQFHCAYQ1*ADH 
CLHK J CTWrnvrvCRKFPRDMKAMSPENCEYFEKHRWPPVWYLKE 
ZrUH I S/KAKKtK tKttJJhHL>r>J<\J}r AKKWljf- WKi>F^i r s^onAooo 
SP5SSSAW 


6468


3 


1374 


D AW AGTN MAAbAP VGS P AS R G PRLAAG L RLLPMLG1»L>QLI*AEPG 
LGRXWIIALKDDVRHKVHLNTFGFFKDGYMVVNVSSI^SLTIEPED 
KDVT3 GFSbDRTKNDGFSSYLDEDVNYCl LKKQSVSVTLLItDI 
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGOSOBPNVNPASAGN 
0TQKTQE)GGKSKRSTVDSKAMGEKSFSVHNNGGAVSFOFFFNIS 
TDDOEGbYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GEIPLPK1,YI£^FFFF1>SGTIWIHILRKRRNDVFKIHWLMAAL 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYY3THLLKGAXLF 
I TIA U GTGWAF1 KHILSDKDKX1 FMI V J PRR VLANVA YI 1IES 
TEEGTTE YGLKKDS 1»FLVDI*LCCGAILFPWWS I RHLQEASATD 
GKGKFSRAHFVLLSLL 


6469 


3 


1374 


DAWAGTKMAALAPVGSPASRGPRLAAGLR LLPMLGWjOLLAEPG 
LGRVHHLALKDDVRHKVHLNTFGFFKDGY MWNV S SLS LNEPED 
KDVT1 GFSl»DR*TKNDGFSSyi»DEDVNYCl LKKQSVSVTLLILDI 
SRSEVR VKSPPEAGTQLPK1 I FS5DEKVIGQSQEPNVNPASAGK 
0TQXT0IKX3KSKRSTVDSKAl^EKSFSVK? ! n , IGGAVSFQFFFNIS 
TDD05GLYSLYFHKCLGKELPSDXFTFSLD1EITEKNPDSYLSA 








G :EI PLPKLY I SMAFFFFLSGTJ W I HI hRKRRNDVFK 1 HWLMAAIi 
P FTKS LS L V FHA1 DYHYISSQGFPI EGW AW Y V 1 THLLXG ALL F 
I I IkLIGTGWAc I KmuSDKDKKI r MI VI PRKVLA1WAY 1 1 IES 
TEEGTTEYGLWKDSLFLVDLLCCGAILFPVVWSIRHLQEASATD 
CKGKFSRAHFVXLSLL 


6470


272t 


1437 


AAASGVSSRADAPVLAQSPASAGNGRPGTPRVPGSRRHPSAFRS 
GPLPREDGCRTPGPQLLPLPGALLKPRTLLSSAAETGRSRHPDT 
OHPSSGGRCRGGTBSPSSAAGRPASriAEAEEDCHSDTVRADDDE 
ENESPAtTDLOACLCMFRAOWMFELAPGVSSSNLEKRPCRAARG 
SLQKrSADTKGKCEOAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPm EFK1 TYTRSPEGDGVGNSYIEDNDDDSKKADLLS 
YFO(X?LTFQESVLKLCOPELESSOJHJSVLPMEVLKYIFRWVVS 
SDLDLRSLEQLSLVCRGF Y1CARDPEIWRLACLKVWGRSCIKLV 
j PYTSWFEMFLERPRVRFDGVY1SKTTY1RQGE0SLDGFYRAWH0 
VEYYRYIRF FPDGH VMMLTTP EE PQS I V P RLRTR 


6471 


175C 


2SS 


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDFALRRRRR 
: GPRNKKRGWRRLAQEPLGLEVDOFLEDVRLQERTSGGLLSEAPN 
i EKLFFVDTGSKEKGLTKKRTKVQKKSLLLKKPLRVDLl LENTS K 
1 VPAPKDVIJ^QVPNAKKLRRKEOLWEKLAKOGELPREVRRAQAR 
I LLNPSATRAKPGPODTVERPFYDLWAEDNPLDRPLVGODEFFLE 
OTK KXG VK R PAR LHTK PS QAPAVE VA PAG AS YN P S F E DHQTLLS 
AA5 IEV E LQR QKEAE KLERQLALP ATEOAATOESTFOELCEGLLE 
ESDGEGEPGQGEGPEAGDAEVCPTPARLATTEKKTEQQRRREKA 
WRl»RVC^AALRAAKLRH0ELFRLRGIiO\0VALRLAELARR0RR 
R0ARREAEADKPRRLGRLKYQAPD1DVQLSSELTDSLRTLKPEG 
m L RDR FKS FOR RNM 3 E PRERAK FKR KY KVKLVEKRAFR E I QL 


6472 


3 


8 97 


S CG S D RAQ W AME F P FD VDA L F PER I TVLDG/H LR P PAR R PGTTT P 
ARVTDLQQQ1 MT1 IDELGKASAKAQNLSAPITSASRMQSNRHWY 
ILKDSSARFAGKGAIIGFIKVGYKKLFVLDDREAHNEVEPLCIL 
DFyiHESVORHGHGRELFQYMLOKERVEPHQLAlDRPSOKLLXF 
LNKFIYMLETTVPOVNNFVI FEGFFAHQHR PPAPSLRATRHSRAA 
AVDPTPAAPARKLPPKRAEGDIKPYSSSDREFLKVAVEPPWPLN 
RAPRRATPPAHPPPKSSSLGNSPERGPLRPFVP 


6473


22 


912 


?SAVFFVWEGEKMAAEPNKTEIOTLFKRLRAVPTNKACFDCGAX 
NPSWASirYGVFLCIDCSGVHRSLGVHLSFIRSTELDSNWNWFO 

U5SAAXARHGTDLVJIDNMSSAVPNHSPEKXJDSDFFTEHTOPPAW 
DAPATEPSGTQQPAPSTESSGLAOPEHGPNTDLLGTSPKASLEL 
KSSJ3GKKK P AAAKKGLGAKKGLGAQKVS SQSFSEI EROAOVAE 
KLREOOAADAKKOAEESKVASMRLAYQELOIDR 


6474


3 


462 


L0ROR0HPAAAPAVPVRCFTFCFTD1VIMPKRXSPENTEGKDGS 
KVTKCEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQI ETVRVKGTEN 


6475


3 


462 


LORQRQH PAAAPAV P VRCPTFCFTDI V 1MPXRKSPENTEGKDGS 
KVTKQE PTR R S ARLS A K PAP P KPE PKPRKTS AX KE PGA K I SRG A 
KGKKEEKQEAGKEGTAPSENGETKAEE3HISRSTVNVSTSRGTP 
PSTVSVKGQI ETVRVKGTEN 


6476 


106 


1090 


ARAMAQY KGTMREAGRAMHLLKKR ERQREQME VXKQR I AEET1 L 
KSOVDKRFSA^YDAV^ELKSSTVGLVTLNDMKARQEALVRERE 
ROLAKROHLEEQRL0OSRCREQE0RRERKRKISCLSFALDDLDP 
gADAAEARRAGNLGKNPDVDTSFLPDRDREEEENRLREELRQEW 
EAOR EKVKDE EMEVT FS YWDGS GHRRTVR VRKGNTVQO FLKKAL 
CGLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDFIIA^^ARGK 
SG PLFSFDVHDDVFLLSDATME KDESHAGKWLRSW YE KN KHI F 
PASRWEAYDPEKKWDKYTIR 





6477 


227 


911 


1. OGHLMG 3 KAASR FliSRFKEWGKH I VCVGRNYADHVREMRSAVL 
. c KPVLFLKPSTAYAFEGSPlL^PAyTRNl,HHELEl,GWMGKRCR 
AVpEAAAMDyVGGYALCI.nMTAKDVODECKXKGLPWTLAKSFTA 
SCPVSAFVPKEXI FljPHKLKLWLXWGELRQEGETSSMI FS IPY 
3 ISYVSK2 ITLEEGPl I LTGTPKGVGPVKENDEI EAGIHGLVSM 
TFKVEKPEY 


6478


2 




FVSSR1L-PES LAS SEAS T1»E AKG K K EEIiDCS S W KKQTTN I RKT F 
3 FKE VLGSGA FSE VFL VKOR LTG KLFAL KCIKKS PAFRDS S LEK 
7. 3AVLKKIKHENI VTLEDI YE.S'J 'l'HYYLVMOLVSGGELFDRILE 
RGVYTEKDAS1.VI00VLSAVKYLHKNG1VHRDLKPENLLYLTPE 
F.NSK IMJTDFGLSKME0NG2 MSTACGTPGYVAPEVIAQKPYSKA 
Y'DCWSIGVITYILI/TGYPPFYEETESKLFEKI KEGYYEFESPFW 
UD3SESAKDFI CHLLEKDPNERYTCEKALSKPWI DGNTALHRDI 
Y PSVS103 OKN FA KS KWRQA FNAAAVVKHMR KLHMNLHSPGVRP 
HVENRPPETQASETSRPSSPE2T3TEAPVIDHSVAL»PALTO^PC 
OH G R R PTAPGG RS LNX L VNG S LH I ^ S S L V F M HOG S LAAG P CGCC 
?SCLNIGSKGKSSYCSEPTl,l,KKANKKCNFKSEVMVPVKASGSS 
HCRAGQTGVCLJM 




2 

■ 


S4< 


^CRGPGWHPAGGOAGAMELLSALELCELALSFSRVPLFPVFDLS 
YFIVSIl.YLKYEPGAVELSRRIIPIACWLCAMLHCFGSYILADLL 
LGEPI.IDYFSMNSS1 LLASAVVIYLI FFCPLDLFYKCVCFLPViO, 
J FVAMXEWRVRKI AVG I HHAHHK Y KHGWFVM I ATGWVKGSGVA 
U^SNFEOLliRGWIKPETNE I LHMSFPTXASLYGAI LFTLQQTRW 
LPVSKASLIFIFTLF^VSCKVFLIATHSHSSPFDAJLEGYICPVL 
FGSACGGDHHHDNHGOSHSGGGPGACKSAMPAKSKEELSEGSRK 
KKAKKAD 


6480 


192 


514 


Er74SI YFPIHCPDY LT^SAJCMTEVMMNTQPMEEIGLSPRKDGLSY 
01FPDPSDFDRCCKI.KDRLPSIWEPTEGEVESGELRWPPEEFL 
VOEDEQDNCEETAKEKKEO 


6481


110 

L .. . 




K S RMDLJ)WNMFVI AGGTLA I P I LAFVAS FLLWP S AL.3 RJYYWY 
HR RTbGMQVRYVHK EDYQFCYS FRGR PGHK PS 3 LMLHGFSAHKD 
MWL»S\TVKFLPKNLHLVCVDWPGHFGTTRSSLDDLSIDGQVKRIH 
CF ^ E CLX 1>N KKP FH LV GTS NG G OVAGV Y AAY VPS D VS S LWL VC P 
AGL0YSTDNQFVQRLKEL0GSAAVEK3PLIPSTPEEMSEML0LC 
SYVRPKVPQQII.QGLVDVRlF/rtWFYRKLFLEIVSEKSRYSLHO 
KKDX1 KVPTOH WGKQDOVLDVSGADMLAKS IANCQVELLENCG 
HSWMERPRKTAKL3 3 DFI ASVHNTDNN KKLD 


6482 


2517 


56e 


E P V SKVSOSRRKAGV FTAN 3 EE S OA VE/iAI4ANVPWAE VCEK FQA 
ALALSRVELHKNPEKKPYT<SKYSARALLEEVKAI*IjGPAPEDEDE 

RPEAEDGPGAGDIIM^GLPAEWEPEGPVAORAVRLAVIEFHLGV
NH1DTEELSAGEEHLVKCLRLLRRYRLSKDCISLCIQA0NNI/5I
LWSEREEIETAQAYLESSEALYNOYMKEVGSPPLDPTERFLPEE
E K L»TE0ER S KR FE KVY THNL Y Y LAQV YQH LEM F EKAAH YCHS TL
XROLEHNAYHPIEWA:NMTLSOFYINKLCFMSARHCI,SAANVI
FGOTGKISATEDTPEAEGBVPELYHORKGEIARCWIKYCLTLMQ
NAOBSMQDNLGELDLDKOSELRAURKKELDEEESLRKKAVQFGT

GELCDA3SAVEEKVSY LRPLDFEEARELFIAjGQHYVFEAKEFFQ 
3 DG YVTDH I E WQDH S A L FKGLA FFE TDM ER R C KMHKRR I AMLE 
PLTVDLNPQYYLLVNRQ I QFE3 AKA Y YDMMDLKVAI ADRLR DPD 
SHI VKKINHLNKS ALKY YQLFLDSLRDPNKVFPEHIGEDVLRPA 
MLA K FRVARLYGKJ 1 TAn?KKELENIATS*bEHY KFI VDYCEKHP 
EAAQEIEVELELSKEMVSLLPTKWERFRTKMALT 


6483


3 


623 


NSHLLCGLRAJ^PI^A^GREARAWEQRLAEFRAARKRAGLAAQP 
PAASC^AQTPGEKAEAJ^ATLKAAPGWLKRFLVWKPRPASARAOP 
GLVQEAAQPOGSrSETPWNTAI PLPSCWDOSFLTN3 TFLKVLLW 
LVL>l,GLFVEIaEFGLAY FVLS LFY WMYVGTRGPEEKKEGEKSAYS 











VFNPGCEAlOGTLTAEObERELOLRPlAGR | 


6484 


201 


965 


OUVKTKWSGLRPGT0VDPEI ELFVKAGSD6ESIGNCPFCQRLF 1 

DFlKIEEFl.EQTLiAPPRy PHIjSPKYKESFDVGCNIjFAKFSAYJ K 
N?QKEANKNFEKSLIjKEFKJ*LDDYLNTPL1J)EIDPDSAEEPPVS 

RRLFLDGDQLTLADCSLLPKLNIIKVAAKKYRDFDIPAEFSGW
RYLHNAYAREEFRHTCPEDKEL ENTYANVAKQKS R


6485


6 


-1 0 5 j 


FVDL>VRAVEFLPCFDS0KLEK£C0SSEFSMGSNSK-RS1 leedee 
DEEPPKVl.LYHEPRSFEVGMLWJHKHKKYPFWPAWKSVRQRuK 
KASVLYJEGHMNPKKKGFTVSLKSI.KHFDCKEKQTLLNQAJ^EDF 
NODI CWCVSb ITDYR VRLGOGS FAGS FLEYYAADI S YPVRKS 1 Q 
0 PV L.GTK 1»P0 LSK G S P E E P WG CP1.G0R 0 P GR KMLPDR S RAAKD 
RANOKLVEYIGKAKC-AESHLRAILKSRKPSRWLQTFLSSSQYVT 
CVET YLEDEGOliDLWK YLOGVY0EVGAKVL0RTNGDR IRFILD 1 
VLLPEA13CAISAGDE\T)YKTAEEKYIKGPSl.SYREKEIFr)N0L 
bEER-NRRRR 


6486


10 


b81 


LVLQAGGAHLSPSRVTQGI YYMLAFSEKPKPPDYSELSDSLTLA 
GGTGRFSGPLHRAWRMMNFRORMGWIGVGLYLLASAAAFYYVFE 
I S ET YNR LALEHI QCH P EE P LEGTTWTHS L K AQLUSLP V W VWTV 
I FLVPYLQKFLFLYSCTRADPKTVGYCI 1 PICbAVlCNRHQAFV 
KASNQISRLQblDT 


6487


352 


863 


SFLKPLRGKMSVTLHTDVGDJK3EVFCERTPKTCENFIALCASK
YYNGCIFHRNIKGFMVOTC;DPTGTGRGGNSIWGXKFEDEYSEYL
KHWRGWSMANNGFNTNGSOFFITYGKCPHLDMICYTVFGKVID

GLETbDEbEKLPVNEKTYRPLNDVHIKDITIHAliPFAQ 




87 


24 j 


TALOEFGTSGPPLSLRFALPSGTGRFKPLFGARGPSKPPSPRVP 
MEPPNLYPVKLYVYDLS KG1 iARRLSP 1 MLGKQbEGl MKTS I VVH 
KDEFFFGSGGISSCPPGGTbLGPPDSWDVGSTEVTEEIFLEYL 

ss lgeslfrgeaynlfe}incntfsnevaofltgrkipsy:tdlp j 

SEVLSTPFG0AI.RPL1DS I QI QPPGGSSVGR PNGQS 


6489 


1457 


37 5 


KVAXMATALSEEEL>DNEDYYSU,NVRREASSEELK^AYRRL»CML I 
YHPDKHRDPELKSOAERLFNLVHQAYEVL5JDPQTRAIYDJ YGKR ! 
GLEMEGWE\A/ERRRTPAilREEFERLQRERJEERRLOORTKPKGT 
ISVGVDATDLFDRYDEEYEDVSGSSFPOI EINKMHI SCS I EAPL 

1 M i Li A -A i uDol) A-O i \}N v> v> v> o X r*r/\l_>KK V X oAJ\'aWVjt'Jt>L r oHu 

DLQGPbFGbKb FRNLTF RCFVTTNCALQFSSRG J R PGLTTVLAR 
NliDKNTVGYLOWHCSSFLLOVORPHRNTRACAPEPSFRPFLHVP 
VWDAECSGARTPSTAWTSAAVKLREACLSGPGSGSHOLLLbTPR 
SKRRTGGG 


6490 


3 


1383 


H E1AG CE VW LG YG P RAAAAAAAT VLFGGAG P TF:tw FVARS 3 AADH 
KDLIHDVSFCFHGRRMATCSSDOSVKVWDKSESGDWHCTASWKT 
HSGSVWRVTWAHPE FGQVLAS CSFDRTAAVWEE 3 VGESKDKLRG 
OSKWVKRTTL VDSRTS VTDV K rAPKHMG I .MbATCS AUG 1VRIYE 
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSSPNAf4AKV03FEYNEWTRKYAXAETLMTVTDPVHDIAFAPNL 
GRSFH I LAI ATKDVP 1 FTLK PVRKELTS SGGPTKFE I HI VAQFD 
KKNSQVWRVSWN1TGTVLAS SGDDGCVRLK KANYMDNWKCTG I h 
XGNGS P VNGSS QQG TSN PS LG SNI PS LQNS LNGS S AGR KH S 


6491 


3 


11B3 


HEAGCE VWLG YG PRAAAAAAATVLPGGAGPTETMFVARS I AADH
KDI>IHDVSFDFKGKRMATCSSDQSVKVWD:<SESGDMHCTASWKT
HSGSVWRVTWAHPEFGOVLAS CSFDRTAAVWEE I VGESNC KLRG
OSFWKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYE
APDVMNLS0WSLQHEISCKLSCSCISWNPSSSRAHSPM1AVGSD
CSSPNAMAKVQIFEYWENTRKYAKAETLMTVTDPVHDLAFAPNB

GRSnillAIATKDVRimKPVRKELTSSGGPTKFEIHIVAQFD 
NHNSQV^RVSWITGTVLASSGDtXXTVRbV.'K^ 




l | KGNGSPVKGSSO0GTSNPSUGSN1PSLQNSLNGSSAGRKHS 


6492 




2 573 


itiri i/ c r*t v 'f ~r~i c run iic m r\nwnrp?nw irnuo»*»i y/— t> ni/K r^r^TJ 
Jrr .LKbt <-LI>r L'r Jr i^l- rivUy vytfciLCEVfcRVTr,HGTPKPFRK 

FDS VA FGES OS E DEC FENDL ETDP PNWQQLVS R EVLL GLKPCE1 

KROEVINELFVTL'RAlfVRTLKVLDOVFyQRVSREGILSPSELRK 

JFSNLEDjLO^i-'-'GLNE^KAVRWlNETSVJDOlGEDLLTWFSG 

PGEE KLKHAAATFCSNCT FALEM3 KSRQKXDSRFQ7FV0DAESN 

FLCRRLQLKDl i PTQMQRLTKYPLLLDNIAT 1 7EWPTEREXVXX 

AADHCRO I l,N Y VUG AVXEAENKCRLEDYQRRLDTSSLKLS EY PN 

VEELRNLX'LTK K KM1 HEGPLVWKVNRDKTIDLYTLLLt DI LVLL 

CKQD DR LA/L R CH S K I LAS TADSXKTFS PVI KIS TVL VR QVATDN 

KALF V I SMS PNG AO 1 Y EL VAQTVS E KTVWQDL3 CRMAAS VKEQS 

TKPI PLPCfTPGEGDNDEEDPSXLKEEQHGISVTGLOSPDRDLG 

LESTL1SSKPC5HSLSTSGKSEVRDLFVAER0FAKEQKTDGTLK
EVGE DYQ3A 1 PL^SHLPVSEERWALDALRNLGLLKQI.LVQOLGLT
F,KSVOEDWCHFPRYRTASOGPQTDSVIONSENIKAYHSC;EGHMP
FRTGTGD3ATCYSPRTSTESFAPRDSVGIAPODSQASN3LVMDH
MIMTPEMPRKEPEGGLDDKGEHFFDAREAHSDENPSEGDGAWK
EEKDVNLRI^GKYLILDGYDPVQESSTDEEVASSLTLOFMTGIP
AVESTHOOCHSYQKTLLSDGAISPFTPEFLVQCRWGAMFYSCFEI

QSVSRCADSOSV1 MEY 3 HX3 EADLEHLXKVEES YTI -XQRLAGS 
AXTDKHSDK5 


6493


SE>7 


114 7 


TPARMAY03SSTSDa^SKTLDSASAHFAASAWSAPVPSRSEVA 
KECNTGHNN 1 KG WOPSG TS KTLYS TNMALS SS PGI S A VQLVRT 
VGHTTTNHL3 PALO S5P0 rLPWWSCLTNAVhLNNVSVvSPVN 
VHINTRTSAPSPTALKXATVAASMDRVPXVTPSSAISSIARENH 
EPERLGLNG1AETTVAMEVT 


6494 


9 4 *> r 


1052 


IVV/Anf" LDDCCTDQQ P'-JI?DrCDHDI>l)DT PDPDailMCIt CZiWVI.n 
HVn\)v>AArvc J roo r . i r\_r\ V_ K rt rlfw Ariir i\ Jt u 1 r*Jc.rt o/A VI V kJU 

LKGKVLI CRN's* RGDVDMSEVEHFMP1 LMEKEEKGMLS PI LAHGG 
VRFKW I KHNN LYLV ATSKKNACVSLV FS FLY KWQVF SE Y F K EL 
SEES3 RDNFV2 arELLDELMDFGYPQTCDSKILQEYITOEGHKL 
ETGAPRPPATVTNAVSWRSEGI KYRKWEVFLDV1 BSVNLLVSAN 
GNVLR SEI VGS : H r <RVFLSGMPELRLGLNDiCVLFDNTGRGKSKS 
VELEDVKFHOCVRLSRFENDRTISFIPPDGEFELMSYRLNTHVK 
PL1W3 ESV3 EKKSR SKI EYH I KAKSQFKRRSTANNVE 1 H 3 PVPK 
DADSP KEKT7VGS V KKVPENS EI VrtS 3 KSFPGGKE Y LKRAHFGL 
PSVEAEDKEGKF -F 3 SVKFE3 PYFTTSG3QVRYLFU 3EKSGYOAL 
PWrVRYITQNGDYOLRTO 


6495 


2425 


2052 


AVAGG7U?PCSTF5.SJ'HRRCRRHRPRPLPRPPAA1KSASAVYVLD 
LKGKV L I CRJ? YR G D VDM S EV E H FMP 2 LMEKE EEGM LS P I LAHGG 
VR FMW J XHNNLY L VATS KKNACVSL VFSFLY KWQVF SK YFKEL 
EEES I RPNFV3 3 YELLDELMPFGYPQTTDSK2 LQEY3 TQEGBKL 
ETGAPFt PPAT VTKAVSWRSEG 3 KYR KNEVFLDVI ES VN LI.VSAN 
GWVLRSE1 VGS 3 KMRVFLSGMPELRLGLNDKVLFDKTGRGKSKS 
VELEDVKFHCGVFLSRFENDRTISFIPPDGEFELMSYRLNTHVK 
PLIWIESVIEKJ:FHSR1EYM3KAXS0FKRRSTANWE3HIPVPN 
DAJDSPKFKTTVG SV K VrvPENSE 3 VWS I KS FPGG KEYXMRAH FGL 
PSVEAEDKEGK PPl S VKFE3 PYFTTSG2QVRYLXII EKSGYQAL 
PWVRYrTONGDYCLRTO 


6496 


24"/ 


559 


LRAVSLLPLO^VLFEYSIHSLFCIMFLCAOEWLTLGLKVPLLFY 
HFKRY FHCPAD£ SEl^YDPPVVMNADTLSYCOXEAWCKLAFYLL 
SFFYYLYGM3YTLVSS 


6497 


1053 


352 


ANT01CRLCPRRHLKPPCGAKKGNGTEEDYNFVFKWL1GESGV 
GKTNLLSRF w ?RKEFSHDSRTT3GVEFSTRTVMLGrAAVKAQIWD j 
TAGLER YRA I TS AYYP.GAVGALLVFDLTKHQTYAWER WLKELY | 
DHAEAT I V\^LVGNKSDLS0AREVPTEEARMFAEm«3LLFLETS | 
ALDSTNVELAFETVLKEI FAKVSXQRONS3RTNAITLGSAQAGC ! 


EPGPGEKRACC1SL 


6498 


2636 


2 7/ 
■ 


SLRliCPlvCtHbAGPTTMRLSSLlALLiRPALPLILGLSLGCSLSL 
LRVSKJCCEGEDPCVE/vVGERGGPQNPDSRARLDQSDEDFKPRl 
VPYYRDPKKPYKKVa-RTRYlOTELGSREnLLVAVLTSRATLSTL 
AVAV>JR7VkHH?PRLLYFTG0RGARAPAGK0WSHGDERPAWLM 
SETLRh'L KTH rGADYDWFFlMOPDTYVOAFRLAALAGHLSINOD 
LYLGRAJEEFIGAGEQARYCHGGFGYLLSRSLLLRLRPHLDGCRG 
D 1 LSARPEF. WI ,0RCL 1 DS LGVGC VSQHOGOOYRS FELAJONRDPE 
KEGSSAFLS A F AVHFVS EGTLMYR LHKR FS ALELERA YS E I ECX 
QAOJRNbTVLTPEGEAGLSWPVGLPAPFTFHSRFEVLGWDYFTE 
OHTF^CADGAPKCPLOGASRADVGDALETALEQLNRRYOPRLRF 
CKORLLNGYRRFDPARGMEYTLDIJiLECVTORGHRRALARRVSL 
LRPLSRVEILFMPYVTEATRVOLVLPLLVAEAAAAPAFLEAFAA 

NVLEPREHALLTUILVYGPREGGRGAPDPFLGVKAAAAELERRY
FGTRLAWLJ^VRAEAFSOVRLMDVVSKKHPVDTLRFLTTWTRPG
PEVLNRC7IMNAJSGW0AFFPVHFOEFNPALSPORSPPGPPGAGF
DFPSP?GADPSRGAP7GGRFDRCASAEGCFYNADYLAARARTJAG
ELAGOHEFEALEGLEA'NDVFLRKSGLHLFRAVEPGLVOKFSLRD
CSPKLSEHEYHRCRI^NLEGLGGRAOLAMALFEOEOAKST


6499 


3 


204C 


SCSADTRPSGOAWPa'VGLRAAAGAFRrGSPljALGPETPCVACljP 

GHPPVRPQVSGGPGAMPDPAAHLP FFYGSISRAEAEEFXKLAGM

7UX^LFLEF<C t CljKSljGGY\'LSL.VHDVRFHHFF I EROLNGTYA1 AG 
GKAHCGF7\EI..CEFYSRDPDGLPCNLRKPCNRPSGLEP0PGVFDC 
1_>RDAMVR DYVR0TWK1EGEA1»EQA1 1 SOAPOVEKbl ATTAHERM 
PWYRSSLn^EEAERKLYSGAOTDGKFLLRPRKEQGTYALSLlYG 
KTVYHYLl SODKAGKVC3PEGTKFDTLWOLVEYI.KLKADGL1YC 
L»KEACPNSSASNASGAAAPTLPAHPSTl»T}rPORRIDTLNSDGYT 
PEPAR1TSPDKPRPKPMDTSVYESPYSDPEELKDKKL.FLKRDNL 
LIADI ELGCGNFGSVRCGVYRMRKKQIDVAI KVLKQGTEKADTE 
EMKR E AO 1 MHQUWP Y 1 VRLI GVCQAEAi>ML»VMEMAGGG P LHKF 
LVGKREE 1 PVSNVAEbLHOVShJG^KYLEKKNFVHRDLAARNVLL 
VMRF.YAK i SDFGLS KALGADDSYYTARS AGKV^LKK YAPECINF 
RKFSSRS D VWF YGVT WEALS YGOK PYKKMKGP EVMAFI E (JG KR 
MECFPECPPEIjYALMSDCWI YKWEDRPDFLTVEQRMRACYYSLA 
S KVEG P PG S 1 0 KAE A^CA 


6500 


1773 


72fc 


TGFTFIASAJDAWGLVRSVTEWCANVRGNPCAAALSCPOAVLDAGK
MI>SESSSFLKCVMLGSLFCALITMLGHIRLGHGNRMILHHEHHH^

0APKKEDI LK3 SEDE^MEIjSKSFRVYCI I LVKPKDVSLWAAVKE 
TWTKHCDKAEFFSS ENVKVFES INMDTNDMWLMMR KA YKYAFDK 
YRDCYNWFFLARPTTFAIIENLXYFLLKKDPSOPFYLiGHTJKSG 
DLEYVGMEGGJVLSVKSMKRLNSLIWIPEKCPE0GGM1WKISED 
KQLAVCLK YAGVFAENAEDADGKDVFNTKSVGLSI KEAMTYHPN 
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFC 


6501 


1 


57 0 


LVGMSGGG TE7 FVGCEAAPGGGSKKRDSLGTAGSAHLI I KDLGE 
IHSRLLDHRPVIOGETRYFVKEFHTEKRGLREMRVLENLKNMIHE 
TNEHTLPK CRDTMRDS LSQVLORLOAANDS VCRLQCR EOER KKI 
HSDHLVAS EKCHMLQKDNFMKEnpNKRAEVDEEHR KAMERUKEO 
YAEME KDLAKFSTF 


6502 


213 


16S0 


AGNKPDFtCAGRNRTAVLPDVSVFHREDVGWWRSKLOOSYQAVKE 
KSSEALEFMKRDLTEFTQWQHDTACT J AATASWKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTJDCDV1TLMGTPSG7 
AE P YDGTKAR L Y S LQS D PATYCN E PDG P P E ii F DAWLS Q FCLEEK 
KGE1SELLVGSPSIRALYTKMVPAAVSKSEFWHRYFYXVHOLEO- 
EQARRDALKQRAEQS I SEEPGWEEEEEELMG I SPI SPKEAKVPV 
AK 1 STFFEGEPG FQS PCEENLVTS VEPPAEVTPSESS ES ISLVT 
01 ANPATA PEAR VLP KDLSQKLLEASLE EOGLAVDVG ETGPSPP 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)






3 HS KPLTFAGHTGGPtTRPFARVETLREEAPTDLRVFELNSDSG 
K ST F' SNN G K KG S S TD1 S EDWE KD F DLL MT EE E VQKAL S KVDASG 
EVSGPGGSLGSEPNGPGCESSPUPAOi.SPQEGPCSCLR 


6503




1650 


AGK'KPDPWAGRNRTAVLPDVSVFHKEDVGWWRSWLOOSVOAVKE 
K PS ALE F M KR DLTE FTQWQHDTACT 1 AATAS WKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTJDCDVITLMGTPSGT 
AEFYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSOFCLEEK 
KfJFT SFl.TjVr; «^PS 7 RAT.YTKMVPAAV^HSEFWHRYFYKVHOLFO 
EOAR^DAXKORAEOSISEEPGWEEEKBELMGISPJISPKEAKVPV 
AXISTFFEGEPGPQ.SPCFKNIiVTSVKPPAEVTPSE5SESISLVT 
OIANPATAPEARVLPKDLSQKLLEASLEEOGLAVPVGETGPSFP 
3 K$ K PLT PAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTBEEVOMALS KVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6504 


213L 


1294 


1 GKVC-jV/VhWVCi->Si LSPPPAOMKTPNftQt/%tjijyUlKAAA^K*\ It, 
SAKKiTKKKVSOKKQRGRPSSOPCRNIVGCRlSHGWKEGDEPITO 
WKGTVLDQVP1NPSLYLVKYDG3DCVYGLELHRDERVLSLKILS 
DRVASSH3 SDANLAWTI IGXAVEHMFrJGEHGSKJ^EWRGMVLAQA 
PI M KAWFY 1TYE KDP V L Y M YQLLDDY KEGDLR I M P E S S E S P PTE 
REPGGWrGLl GKHVEYTKEDGSKR 1 GMV3 HQVEAKPSVYFI KF 
DJWHIYVVDLVKXS 




2331 


1294 


GK VCLVAH WVCLS 1 LS P P PAGMKTPRAQEAEGQQTRAA^GRATG 
SANMTKKKVSOK KQRGR PSSOPCRN 1 VGCR 1 SHGWHEGFJEP 1 TO 
WKGT V LDCV P J N PS LY LVK YDG3 DCV y G LEXHRDER VLSLK 1 LS 
DRVASSH3SDANLANTIJGKAVEHMFFGEHGSKDEWRGMVLA0A 

ni iii!/)iLtrvT'r>vpV"\ni/! vMVfM t T»PV I'urni TJTMDrpCPCtiDTP 
PJ rlKAWr j J J ) hKiJPVli j ril yi.»L>Ul/ - A ~i^UJLa IHrfcC/ofcol y l c. 

REPGGWDGL3GKHVEYTKEDGSKR3GMV3HOVEAKPSVYFIKF 

DDDFH I YVYDLVKKS 


6506


1 


1350 


EVSFPTSCCLTVAVADPGVSEGFRGFGAGCEMPGRGRCFDCGST 
ELVEDSHYSCSOLVCSCCGCWTEGVliTTTFSDEGNLREVTYSR 
S7-GENE0VSRSO0RGLRRVRDLCRVLGLPPTFEDTAVAYYQQAY 
RHSGI FAARLOKKEVLVGCCVLITCROHNWPL'rMGAICTLLYAD 
LDVFSSTYM03 VKLLGLDVPSLdjAELVKTYCSSFKLFOASPSV 
PAKYVEDKEKMLSRTMQLVELANETWLVTGRHPLPVITAATFLA 
WOSLQPAPRLSCSUVRFCKLANVDLPYPASSRLQELLAVLLRMA 
EO^WLRVLRLDKRSWKHIGDLLQHROSLWSAFRDGIABVET 
R EKt'P PG WG0G OGEGH VGNNSLGLPOG KR PAS FALLLPPCMLKS 
PKR3CPVPPVSTVTGDEN1SDSEIE0YLRTP0EVRDFQRAC?AAR 
0AATSVPNPP 


6507


187t 


92$ 


RSrlASRLPFLPSGCLVLQVQELVCMSGKEATVTIPIWONKPHGA 
A^SVVRRIGTNLPLKPCARASFETLFNISDLCliRDVPPVPTLAD 
I AW 3 AADE E ETY AR VR SDTR PLRHTW K PS P L I VMQRN AS VFNLR 
GS ^ERLIALKKPAiPAliSRTTEbQDELSHLRSQI AKI VAADAAS 
ASL.TFDFLSPGSSNVSSPLPCFGSSFHSTTSFVISDITEETEVE 
VPELPSVPJLLCSASPECCXPEHKAACSSSEEDDCVSLSKASSFA 
DKMG3 LKDFHRMK0SODLNRSLLKEEUPAVL 1 SEVLRRXFALKE 
EDI SRXGN 


6508


862 


342 


wearkrporwpserrevrVppphlorgksglepgtprki^aaarp 

S LGR VLPGS £ VLFLCDMQEKFRHN1 AY FPQI VSVAARMLKNTTL 
DLLDRGLQVHVVVDACSSRSQVDRLVA1ARMRQSGAFLSTSEGL 
IL0LVGDAVHP0FKEI0XLIKEPAPDSGLLGLFCX5QNSI.LH 


6509


2 


1053 


F VWN PRGGR KR R RQAAVTQAATRASGTF S PRDGTMT0GKLS VAN 
KAPGTEGQOOVHGEKKEAPAVPSAPPSYEEATSGEGMKAGAFPP 
APTAVPLHPSWAYVDPSSSSSYDNGFPTGDHELFTTPSWDPOKV 
RRV FVR KVYT I LLIQLLVTIiAWAiFTF CDPVKDYVOAI-JPG WYK 
ASYAVFFATYLri^CCSGPRRKFPtWLILLTVFTLSKfcAYLTGML 


<< VVNTfSVbbCLGiTALVCLSVrVFSFOTKFDFTSCQGVLKVL 
• f.T'JFFSO L3LAI LLPFQrVPWLKAVYAALGAGVFTLFLALDTQ 
hi '-"GNRRHS liSPEEY 1 FGALN I YLDI J Y 1 FTFFLQLFGTNRE 


6510 


3? 


1156 


PC;,L,DGCI J ORGAVHPLLSSAMGLLAFLKT0FVLKLLVGFVFWS 
Gl/INF^QL.CTLALWPVSKOLyR^LNCRLAYSLWSOLVMLLEWW 
.cr: iJCTI.FTDOATVERFGKBHAVI I LN'HNFE 3 DFLCGWTMCERF 
GV^GSSKVi^KKELLYVPLlGWTWyFLEIVFCKRKWEEDRDTW 
EG* RhS DY P EYMWFLLYCEGTRFTETKHRVSMEVAAAXGLPVL 
K v - : LPRT K G ? TTAV KC LRGTV AAV y DVTLN FRGNKN PS LLC 1 1» 

YC-K K y EADM C VRR f p led i pldekhaaowlhklyoskdalqeiy 

K GKFPGEOPKPARRPWT1.LN FLSWAT1 LLS PLFSFVLG VFAS 
Cn LLI LTFLGFVGAGNGHCR 


6511 


2 " 


1 <i ? S 


On OPIJ& A A FTPCI.FOVTfV^ARnPGTWA*? FP^PI POPAPT.KfJGK 

LL«^ J l L*T^/vr\* 1 UvUCy V i. v^J*\kJ i*/ * \3 1 r> /Vj r I O * lj* V?* /A* l_>rvv_^V? fV 

rrtf.TNFSUJ VKOGYVKMKSRKLGIYRPCWLVFRKSSSKGPQRLB 
KVrDl'KSVCl.RGCPKVTEIShA^KCVTRbPKETKRQAVAI I FTDD 
r/o.j ^CP?ELEAEEWYKTLSVECLGSRLND3 SLGEPDLLAPGV 
OClOTPPrNVFLLPCPNLDVYGECKLOlTHENiyLKDIHNPRVK 
LV* i'." PLC? L R R YG R DA TR FTFE AGR MCDAGEG L Y TFOTQEG BQ I 
YC'I VKSATLA 1 AEQKKRVLLEMEKNVRLLNKGTEHYSYPCTPTT 
RSAYVJKlUTCStfNIAF^SSYAGEGYG/AOASSETDLUJRFI 
LI i.FKPSCGDSSEAXTPSO 


6512


15< 


80? 


K: r. KSTWFP LSRS LRV AS GRSC KLGHGGYTG SG PGFG E PRDSGA 
EV:- < GSGRATGCERGGVRGARQGRAPGSS 3 WR KEPRMVCTRKTK 

1 Lm I V_ V Jl ,L>o v>rl J JM.1 J v- -L) L J I V vy r» V 1 rt 2 1 V I v nuyc/'ArL'iMMi 

Fi i:kgdti.ki ierldhlenvikohiqeapakpeeaeaepftdss 

LFA} 1 WGQFLS PEGRRVALKQFCYYGYNAYLSDRLPLDRP 


6513 




7£6 


FVT i ■■ E PGF S LA0LNL1 WOLTDT KQLVHSFAEGQDQGSA Y ANRTA 
Li- ID LLAOGNASLRLQRVRVADEGSFTCFVS IRDKGSAAVSLQV 
AMY SKPSMTLEPNKDLR PGDTVT1 TCSSYQGYPEAEVFWQDGQ 
GV> LTGNVTTSOMAJ^EOGLFDVKSILRVVLGANGTYSCL-VRNPV 
LCO: AUSSVTI7P0RSPTGAVEVQVPEDPWALVGTDATLRCSF 
SP! r'GKS LAO 1 >NL 1 WQLTDTKQLVHS FAEGQDQGS AYANR7ALF 
PCLLA0GNASLRLQRVRVADEGSFTCFVS1RDFGSAAVSLQVAA 

PYSr psmtlepnkdlrpgdtvtitcssyogypeaevfwqdgogv 

PLT: : rm*TSO.M/U*EQ^LFDVT3SILRWl^^ 
ODa. ; :SSVT3TPORSPTGAVEV0VPEDPWALVGTDATLRCSFSP 
EPC-7 5LA0LNLIWQLTDTR0LVHSFTEGR 


6514 


985


302 


VG i 3 ) • G P T I S S AA EMKDLLDLOEELR YS LATSRAKMGRRAQQESA 
QArvHLNGKNPSLTLTGETSSAKLPRCRC^WAGDSVKASXFRR 
KASV E3 EDFRLRPQSLNGSDYGGDI PI I PDLEEVQEEDFVI.QVA 
APK 1 QI XR W5TYRJDLDNDLMKYSA3 QTLDGE I DLKLLTKVLAP 
2HEVKERNPSKODDVGWDV3DI{LFTEVSSSVLTEWDPLQTEKEDP 
AGC^.RHT 


6515 


1345 


305 


GRVL r; R R RG AA.V PGGCGAGSTQLE VSAS ASCGALGS ADMNP I W 
\THC ; 'v C-AGP3 S'KDRKERVKQGMVRAATVGYGILREGGSAVDAVEG 
A WA ]_,E DDPE FNAG CGSVliNTNGEVEM DAS I MDG KDLS AG AVS A 
VOCiANPI KIARLVMEKTPHCFLTDOGAAQFAAAMGVPEI PGEK 
LVTL^NKKJRLEKEKHEKGAQKTDCOKNLGTVGAVALDCKGNVAY 
ATSTT- GI VN KMVGRVGDS PCLGAGGYADNDI GAVSTTGHGES I h 
KVN^ARLTuFH I EOGKTVEEAADLSLG YMKSRVKGLGGLI WSK 
TGD^AKWTSTSMPWAAAKPGXLHFGl DPDDTT3 TDLP 


6516 


1 


1402 


FRPL?<YLGODATAAARI)LRTRG3vOGYCPSATAR0QVLVSAi-QQL 
XG^^S EHRNENOBMPYSTNKEL1 LG3MVGTAGI SLLLLWYHKVR 
XPG3 /iMKLPEFX^SLGHTFNS I TLQDE I IIDDQGTTVl FQERQLQ I 
LEKJ^ELLTKYiEELKEEIRFLKEAIPKLEEYIODELQGKITVHK 
3S?CHRARKFRL>PTIQSSATSNSSEEAESEGGYITANTDTEEOS 









FPVPK^FNTRVEELNLDVLLOKVCHLRMSESC-KSESFELLRDHK 
EXFRDE I EFMWRFW*AYGDMYELSTNTQEK)OJYAN IGKTLSERA 
I NRAPMMGHCHLVJ YAVX.CG Y VSEFEGLQNK 1 NYGRLFKEHLDI A 
iK^LPEKPFLYYLKGRVCYTVSiaSWlEXKWAP.TLFGKIFSSTV 
OEM^INPLKAEELCPGYSNPNYMYLAKCYTDbEENQNALKFCHb 
ALLLPTVTKEDKEA0KEKOKIMTSLKR 


6517


3 


143< 


GRWGGTsSSU«AMV7VRGHAEbYERW0RQGARGWDYAHCLPYFR 
KA0GHELGASRYRGADGPLRVSKGKTKHPLHCAFI,EAT0OAGYP 
bTEDMNGFQOEGFGWMDMTlHEGKHV.'SAACAYLKPALSRTNLKA 
E AETLVS RVLiFEGTRAVGVE Y VKNGOSHRAY ASKE V I LSGGA I N 
S PQLLMLSG 1 GNADDLK ICLG 1 P WCH LPGVGONLODHLEI Y 3 QQ 
ACTRPITLHSAQKPLRXVCIGLEWI WKFTGEGA7AHLETGGFIR 
SQPGVPHPDI QFHFLPSQV1 DHGRVPTQQEAYOVHVGPMRGTSV 
G WLKLRS ANPQDH FV I QPNYLSTETD 2 EDFR LCVK LTREI FAOE 
ALAP FRG KELQPG5K J QSDKE 1 DAFVKAKADSAYJ1 PSCTCKMGQ 
PSDPTAWDPQTR VLGVENLR WDA-S 3 MPSMVSGNLNAPTIM3 A 
EKAAEtf J KGQPALWDKDVPVYKPRTXATQR 


6518


242 


I09f 


PAWNPGf EPRTRVRPRARSFPLPPPRAPRRRRHRLLRAVPGPSR 
R I IRCR RRA P P P PSTMGDAGS ER S KAPS LPPR CPCG FWGSS KTttN 
LCSKCFADFQKKQPDDDSAPSTSNSOSDIiFSEETTS DNNNTS1T 
TPTLSPSOQPLPTEI.NVTSPSKEECGPCTDT/^HVSLITPTKRSC 
GTDSQSEKEASPVKKPRLLENTERSEETSRSKGKSRRRCFOCCT 
KlXLVOOELGSCRCGYVFCMLHRLPEOHDCTFDHviGRGREEAIM 
KMVKIORKVGRSCQR3 GEGCE 


6519


3 


1113 


ERKMAnPPSPVHCVAJJUVPTATVSEKEPFGKLOl'SSKDPPGSLf. 
AK KVR TEEKKAPR R VNCEGGSGGNS RQLQPP AA PG PQS YG$ PAS 
WSFAPLSAAPSPSSEKSSFSFSAGTAVPSSASASLSQPGPRKU, 
V P PTl^HAOPHHbLi PAAAAAAS AN AKSRRPKE KR E KE RR RHGL 
GGAREAGGASREENGEVKPLPRDK3 XDKlKEKtiKEKEREKKKHK 
VMNE Z KKENGEVKI LLXSGKEKPKTN I EDLQ3 KKVKKKKKKKHK 
ENEKR KR P KMY S K.S 1 CT 3 CSGL.LTDV E DOAAKG ILMDN1 XDYVG 
KNLDTKNYDSXIPENSEFPFVSLKEPRVONNLKJ^LDTLBFKOLI 
H2 EHOPNGGASVIHCXQ 


6520


3 




tRKhiAEPPSPVHCVAAAAPTATVSEKEPr-GKLQLSSRDPPGSLS 
AKJ<VRTEEKKAPRRWGEGGSGGNSROI>QPPAAPSPOSYGSPAJS 
WSFAPLSAAFSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLb . 
VPPTLLHAOPHHLLLFAAAAAASAN AK SRRPKE K R E KERRRHGL 
GGAREAGGASREENGEVKPLPRDK7 KDXI KERPKEKEREKKKHK 
VMNEI KKENGEVKI LLKSGKEKPKTNI EDLQ1.KKVKKKKKKXHK 
ENEKRKRPXMYSKS 3 QTI CSGLLTDVEDQAAKG: LNDNIKDYVG 
KNIjOT KNYDSX I PENS EFP FVS LK E P R VQNNLKR LDTLEFKQLI 
HI EHOPNGGASVIHCLQ 


6521


184 


1796 


KLFKKATDTSOGELV1 J PKALPL I VGAQL I HADKLGEK VEDSTM P 
I RHTVNSTRETPPKSKUAEGEEEKFEPDI SSEES VSTVEEQENE 
TPPATSSEAEOPKGEPENEEKEENKSSEETXKXEKDOSKEKEKK 
VKKTI PSWATLSASQ1»ARA0K0TPMA55PR?KMDA J LTEAI JCAC 
FQKSGASWAIRKYI 1 HiCYPSLELERRGYLLKQALKREliNRGVJ 
KQVKGKGASGS FVWOKSRKTPQKS RJ1RKNRS S AVDP EPQVKLE 
DVLPIAFTRLCEPKEASYSLIRKYVSOYYPKLRVDIRPQLLKKA 
LQRAVERGOLEQI TG KGASGTFOLKKSGEKPLI>GG£ LME YA1 LS 
AIAAMNEPKTCSTTALKKYVLENHPGTNSHYOMHI.LKKTLQKCE 
KHGWMEQ3SGKGFSGTF0I-.CFPYYPSPGVLFPKKEPBDSRDEDE 
DEDESSEEDSEDEEPPPKRRLOKKTPAKSPGKA^SVKQRGSKPA 
PKVSAAQRGKARPLPK KAP F KAXTPAKKTR PS S TV 1 KKPSGGSS 
KKPATSARKE 


6522 


1042 


391 


NKWLRPSPRSHRTPESGRVtSLFRLFPPGMALSGSTPAPCWEED J 








!7fi rwvrMi ci lict^i ; ' "v/^/rT^r^i refn r*i i t riCim/^ a kr^r*! 1 
^,L.Lui .v.-rj LibLiilKr. r L. V VLj^j^Li J L.V. tl^ULiAr iji-»L/c»Aytj>\>Ujt»l> 

5IR^^SGLKbLbELEKKG0CDKSNbRLLGQLLRVLAKHDLLPKLA 
RKRRRPVSPERYSYGTSSSSKRTEGSCRRRROSSSSANSQQGSP 
PTKFORRSRGRPSGCARRRRRGPCPHPSSSQSPPDbPbKAK 


6523 




1097 


ASCOT R RR TAALDSC E S I AGRRSF I ALAMA£NF"NDI VKOGYVKI 
NSRr^GIFRRCWLVF>:K.AS£KGPRRl,BKFPDEKAAYPRNrHKVT 
EI.KNI KNI TRLPRETKKHAVA3 I FHEETSKT FACES ELEAEEWC 
KH bCM E CLGTR LND 1 5 L.GEPDbbAAGVQREQNERFTWbMPTPN 
LD I YGECTMQI THEN I Y LWDI HNAK VKLVMWPbSSLRRYGRDST 
VJFTFESGRM CDTGEO 1 ■ F TF QTR EG E M 3 7 QK VH S ATLAl AEQH ER 

l^illmeqkari.qtsltcpmtl.sksislprsaywhhltronsvge 
]ys:ognhenrh5jdlvgkscktse:krfleenaplvmygithhlf 

MDTSTCKWHDLE 


6524 


2 


1097 


ASCCTRRRTAAbDSGER I AGRRS F I AbAMASN FND1 VKQGYVK1 
P ? R K L>3 1 FR RCWI.V F K K AS S KG P RRl iEK FP DE KAAY FRN r HKVT 
F.I.HKI KNITRLPRt'TKK>lAVA3 1 FHDETSKTrACESELEAEEKC 
KJHiCVxECLGTRIiNDU^GBPDLLAAGVOREONERFNVYLMPTPN 

ldi ygkctmqithen: vlwdihnakvkl\^wpi,sslprygrdst 

WF7FESGRMCDTGEG1.FTFQTREGEM3YQKVHSATLA1AE0HER 
l.Wl.EKEOKARL<?TSL-TrPMTL£?KSISLPRSAywKHITRONSVGE 
I V S 1-0GNJ I E N RH SD 1 *G K S C KT S t) KR F L EE N A P b VM YG I THHLF 
KDTSTCKWHDbE 


6525 






GFSFFSEEESIEFNPiS SGRSAR7VSSNSFCSDDTGWPSSQSVS 
PVKTPSDAGNSPIGFGPGSDEGFTRKKCTIGMVGEGSIOSSRYK 
KFSKSGbVKPGSFAOrs-.SSSSTGSISAPEVHMSTAGSKRSSSSR 
K RG PRC R SNGAS SH K P C S S P S S P R E XDbbS MbCRNQbS P VN I HP 
£ Y A Pi : 5? P£ S SRS GSY KG 5? DCS P 1 KR R SGR YMS CGENHGVR P PNF 
EC Y bTPbOQ KEVTVRH i ,KTKbKES ER RbHERES EI VEbKSQbAR 
KREDW1 EEECIIRVEAClAbKEARKEI KQbKQVl ETMRSSbADKD 
KG I Q K Y r*VD I N I QN K K 1 .E S bbQ S K EKAK SGS bR DEbCbD F P C DS 
r E KS LTbNP PbDTMA-DG bS L.EE0V TG EGADR ELbVGDS J ANSTD 
LFDEI VTATTTESGCLr.bVHSTPGANVbELbPI VMGQEEGSVW 
ERAVOTDWPYSPA3Sli,3QSVbOXbCDPCPSSbASPDESEPDS 
MES FFESLS ALWDbTI RNPNSAI L LS P VET P Y ANVDAE VHANR 
LM R E ID FAACVEFR LI>G V I PbARGG WRQY WS S S FbVDLbAVAA 
PWPTVbWAFSTORGG'J DPVYNj GALLRGCCWALHSLRRTAFR 
1KT 


6526 




2(Ti4 


SC-RAGEPEEWRGRQI JLSKETWl PrNSEDSQQbEEAYSSGKGCN 
GR W FTDGGRYDVHbG ERMR YAVY WDEbASEVR R CTWFYKGDKD 
NKYVPY SES FSQVLEF.: Y MbAVTbDE WK KKbES PNREI 1 1 bHNP 
IM^rlVn i \^ Y v a»vj buuww; ,i i'ri^.v^r* rK l v wtov LiiiovuJ nw&f 
b03 DHbVFVVHG I GPACDbR FRS I VOCVNDFRS VSLNLLOTHFK 
KAQENQQIGRVSFbPWWKS PbJJSTGVDVDLQRI TbPS3NRbRH 
F7 ND T I LDVFF YNS PTY COT I VDTVA S SKNR I Y TLFLQRUPDFK 
G3V S 1 AGHSbGS b I bFC 3 bTNQKDSbGD I DS EKGSbNXVMDOGD 
TPTLEEDbKKLQLSEFFDI FEK^KVDKEALALCTDRDbQEIGI P 
LGPRKKTbNYFSTRKNSMGIKRPAPOPASGANIPKESEFCSSSN 
TRNGDYLDVGIGOVSVK Y PRbl YKPE I FFAFGS P IGMFLTVRGb 
KRlDFUYRFPTCKGFFNlYHPFDPVAYRIEPMWPGVEFEPMbl 
PK^KGKKRMHLEbREGLTRMSMDbKNNLbGSbRMAWKSFTRAPY 
PALOASETPEETEAEPESTSEKPSDWTEETSVAVKEEVbPINV 
GMLNGGQR I DYVbOEKF 2 ES FNEYbFALOSHbCYWESEDTVLbV 
bKEl YQTQGI FLDQPU 


6527 


1 


9i*<. 


GWVPLLSRIbPSDACKlYKQGIMIRbDTTblDFTDMKCQRGDbS 
FIFNGDAAPSESFVVbD?JEOKVY0RIHKEESEMETEEEVDJLMS 
SDIYSATbSTKSISFTR/'.OTGWbFREDKTERVGNFbADFYbWG 








LVLESRKRREHLSEEDli.RKKAIMESLSKGC-NlMEONFnpl RRC ' 
SLTPPPONTirW^EVlSAENGKAPHLGRELVCKESfCKTFKATIA 
MSQEFPLGI El»LIjNVLFA r VAPFK}IFNKbREFVOMKLF FGFP VKL> 
DI PVFPTITATVTF0EFRYPSFLX3S I FTI PDDYKEDPSRFPDL, 


6528


1 


10/3 


LTG P AAAE PRC A AL)AG M KRALG R R KG VWLRLR K I LFCVLGI.Y1A , 
1 PFLI KLCPG1 QAKLI FLNFVRVPYFJDLKKPCDQGLNHTCNYY 
l^PEEP\rriGVWHTVPAVMW-KNAOGKDOMWYEDAlA££HPl 1LY 1 
LHGNAGTRGGDHRVEbY KVLS^LGYHWTITJYP.GWGDSVGTPSE 
RGMTYDALHVFDWI KAKSGDNPVY3 WGHSLGTGVATNLVRRLCE 
RETPPDALI LESPFTN1 REEAKSHPFSVIYRYFPGFDVJFFLDPI 
TSSG I KFAN DENVKH I -S C PLL I LHAEDD P W P FQLGR KLYS I AA 
PARSFKDFKV0FVFFHSDLGYRHKY1 YKSPELPRILREFLGKSE 
PEHQH 


6529 



363 


2215 


THJRYlaKIGWKTMSCGNEFVETLKKIGYPK/vPNLNGEDFDWLF 
EGVEDESFLKWFCGNVNEQNVLSERELEAFSJLQKSGKPJLEGA 
ALDEALKTCKTS DL KTP R LDD X ELE KLE DE VG/TLbKL KK I ,K I OR 
RNKCOLMASVTSHKSLRLNAKEEEATKK^KOSOGILNAMITKIS 
NELOALTDEVTOUWFRHSNLGOGTHPbVFbSOFSL>EKYLSOE 
EQSTAALTLYTKKOFFQG 3 HEVVESSNF.SOKFNFbKl QTPS 1 CD 
NQE3 LEERRLEMARLOI -AY 1 CAOHQL.IHLKASNS SMKSSI KWAE 
LSLHS LTSKAVD K ENLPA K I SS LTSE I M KLEKE VTQ I KDRS 1>PA 
V VR ENAO l.LKMPWKGDFDLOl AKQDYY TARQE L VLNQL JKQKA 
SFELbQbSYEJELRKHRDJYROLENLVQELSOSNMMLyKOLEML 
TDPSVS 001 N PR NT 1 DTKDYS THRLYQ VbEGENK K KELFLTHGN 
LEEVAEKLKCNlSbVOPOLAVSAOEHSFFbSKRNKDVDy.bCDTb 
YOGGNObljbSDOELTEOFHKVESOLNKLNKLLTDILADVKTKRK 
'i LANNKLHCMSREFYVYFbKDEDYXKDl VSNLETOSK ) KAVSLX 
D 


6530



128 


298£ 


GAAUHGAI VQVH PLLPGS ST1 MI HDLCLVFPAPAKAVVY VSDI Q 
ELYIRWDKVEIGKTVK^.YVRVLDLHKKPFLAKYFPFKDLKLRA 

AS P 1 1 TLVAL.DE aldny ti tfi.i rgvai gots ltas vtnkagqr 

: NS APOO IEVFPPFRLMP RKVTLL1GATMQVT55 EGG PO PQSN I L 
FSI SNESVALVSAAGLVC'GLA j GNGTVSGLVOAVDAETGJCv'Vl I 
5 ODLVQVEVLLLRAVR 3 RAP I MRMRTGTQtf PI YVTG I TNHQN PF 
.SF3NAVPGLTFHWSVTKRUVLDLRGRHHEAS1RI.PS0YXFAMNV 

LGRVKGRrGLP^V^KAVt)PT£G0LYGLARE^SDE10VC>VFEKLO 
^LNPEIEAEQILMSPNSYIKLOTNRDGAASLSYPVLDGPEKVPV 
\mVDEKGFLASGSMl GTST3 EV 1 A0EPFGANQT3 3 VAV KVG PVS 
yLRVSMS?VLHTONKJEALVAVP:,GOTVTFTVHFHDNSGDVFIlAH 
S S VLNFATNRDD FVQ I G KG PTNWTCWRTVS VGLTLLR VWDA KH 
PGLSDFMPLPVLOAI S PELSGAT'IVVGDVLCLATVLTSl SGLSGT 
V?SSSANSTLHIDPKTGVAVARAVGSVTVYYEVAGHLRTYKEVW 
£VPQRlMAR>rLHP3QTSFOEATASKV3VAVGDRSSNLRGECTPT 
QREVIOALHPEJL-SCObC'f KPAvrbr Fh\}uyr J V Lr\jtl) lAiA* 
QYFCSITWHRLTDKORKHLSMKKTALWSASLSSSHFSTECVGA 
EVPFSPGLFADOAEII.LSNHYTSSEIRVFGAPEVLENLEVKSGS 
FAVLAFAKEKSFCWPSFITYTVGVLDPAAGSOGPbSTTLTFSSP 
VTNQA1 Al P VTV AF WDRR GPG PY G ASLFQHFLDS YQ VKF FTL F 
ALLAGTAVM J IAYHTVCTPRDLAVPAALTPRASPGHS PHY F AAS 
SPTSPNALPPARKASPPSGLWSPAYASH 


6531 


84 5" 


1425 


PSAS1PPSASPDPVPD1RTCHFCLVEDPSVGCISGSEKCTISSS 
? LCMV1 T I Y YDVKVR FJ VRGCGQ Y IS YRCQE KRNT YFAE YWYQA 
OCCOYDYCNSWSSPQbQSSLPEPKDRPLALPLSDSQIOWFYOAL 
tfLSLPLPNFHAGTEPDGLDPMVTLSLNLGLSFAELRRMYLFLNS 
S GLL VLP QAGLLTP K PS 


6532 


2 


954 


A7*GP PS E WNODS L FPE PE PG PAPO VLLGPOG PGL J KG VAPFTL 








ITDSTGTHLVLTVTNKNAKSPGI.SRGSPQO? t £ QPGSFAPAPSA 








QMDLEH P LQPLFG TPT S I jLKK EP PGYEEAMSOC P KQQENG S S SQ 








OMDDLFDlLlOSGElEADFKSPFSLPGKEKPiKKTVCWEPlAAO 








PSPSAELPQMPPPPGSPSLPGRLEDFLESSTGLPLLTSGKDGP 








EPLSI^lDDJUHSOMLSSTAILDHPPSPMDTSH^HFVPEPSSTKiGL 
















LOLHWDSCL 


6533 


1798


373 


STISWlARVEPPRRSSGVCAARLRFPGGSRFLRARACVLAliAVL 
ALL ERNNADSMSAHvSMbC E R I AIAKELI KRAF SLSR SR KGG I EG 
GAKLC SKLKAELK FLQKV EAGKVAI XESHL0S7NLTHLRA1 VES 
AENLfc-kV v£> VLH V r bilLi JL»k>tKL i JL»V VJJV w» *(.?L>H1 « v rvAItaK 
KAEALHjaiWLGRGOyGDKPJJEOAEDFLQASKCOPVOyf NPHII 
FAFYHSVSSPMAEKLKEMC I SVRGD3VAVNAL LDHPEELOPSES 
ESDt'EGPELLOVTRVDRENI 1 ASVAFPTEI KYDVCKRVNLBITT 
LITYVSALSYGGCHF1FKEKVLTEQAEQERKE0VLPQLEAFKKD 
KELFACESAVKDFOSILDTLGGPGEKERATVH • XR INWPDQPS 
ERAL R i >V AS SKINSRSLTIhG TG DTLKA ITKl/'-NSG F VKAmNNQ 
GVKFSVFIHQPH AL.TESKEALATPLPKDYTI LU."EK 


6534 


47 


596 


KATRFJSAAFVVLNXOGVfJFAKLPHTSWSWSLCTLSFLFSGDLA 
EKSLOCFPCSAMLLELIPLi.GIHFVLRTARAUiVTOPDliiirVS 
ECASLELKCNYSYGATPYLFWMERTVEEAFI LLVCLKPWRVASS 
LEKK F. K EDES FOLLLG SR Y NV LKAHCLLP L I R V\ 1 .TSGDS 1 .LS AO 
PHCPQGL 


6535 


250 


964 


L1KTFFNDVAIQRDL.LPKEKNLETLLTLAF1 F.1 DKAFSSKARLS 
ADATLL. TSGTTATVALLRDG I ELWASVGDi VJ- l LCRKGKPMKL 
TIDHTPERKDEKER3KKCGGFVAWNSLGQPHVNGP.LAMTRSIGD 
LDLKTSGVXAEPETKJllKUIllADDSFLVLTTDGJ NFMVNSOEI W 
D FVNOCHD PNEAAHA VTEO A 1 QYGTEDNSTAWY P FGAWGKY KN 
SEINFSF5RSFASSGRWA 


6536 


242 


1174 


SLVKEMTNQYG1LFK0 EOAH D DA I W S V AWG T " K KENS ETWTG S 
LDDIjVKVW KWRL'dRLDIjO^.SLLGHULGV Vb vi: I >: 1 J_»Pi M-Sbb 
LDAHI RI.WDLENGKQIKS1 DAGPVDAWTLAFFJ'PSQYLATGTKV 
GKVN] FGVESGKKEYSLDTRGXFILSIAYSPDC-XYliASGAl DG1 
INI FD1 ATGKLLHTLEGHAMPIRSLTFSPDSCL-VTASDrjGYI K 

iydvqfl^lagtlsghaswvlnvafcpddth: VSSSSDKFVK\^W 

OVGTRTCVUTFFDHQDQWGVKYNGNGSKI VfA'f-DDOEI H3 YDC 
PI 


6537 


1638 


921 


tfRFNPF ptqgpdpslvyrpdvdpevakdxas k vtsgplI.drv 

VTT Y K L MH THOTVD FVR S KKAO FGG FS Y KKMTVK E A VDLLDG L V 
DESDPDVDFPNSFHAFOTAEGlRKAHPDKDWFi-'LVGLLHDLGKV 

lalfx^powawgdtfpvgcrpoaswfcd^^k'oekpdlodpry 
stel^kyqphcgldrvlmswghdgearggqwggggrwgtvgggg 
aeavpagdtlspqstctr 


6538 


3345 


2412 


PYLYDFLDALITCOTAPEFJiFIKLDGLAGMLTEOLRRLTKOVOE 
ARHNRDDEAI KKAVNEYDETMEKY1 PVLMAOAK I YWNLENYPMV 
BK I F S KS VE FChnDPrD WKLNVAHVLFMQEN K V X HA3GFYE PI VK 
KHVDN7 LNVSAI VLANLCVS Y JMTSQNEKAEEL?<KKI EKEEEQI* 
SYDDPNRKMYHLC3 WLVI GTLYCAKGNYKFG3 f RVI KSLEPYN 
KKLGTDTWY YAXRCFLSLLE?CMSKHMIVIHDSV J OECVQFLGHC 
ELYGTN I PAVI EQPLEEERMH VGKNTVTDESR OLKAL I YE 1 1 G W 
NK 


6539 




339 


FLGAAS PH PHFS SLAPHPDOPEFTPVQDELEA T <i.'LWGPGV 


6540 


3 


391 


LER LWLL LLPvR P EDAMAECP TLGE AVTDHPDR L >/ A WE K FV~/ LDE 
KQHAKLFLTIElNDRliOLRVLLRREDVVIiGRFKTPTOIGPSLLP 
IMWnLYFDGRYRSSDSSFMRLVYHlKIDGVEDMLLELLPDC 




1165 


536 


RTLVQRR i :..MLLRKPARGRDLRGRGRGTPRGGRKGI,LPTPDEFP 
RFEGGRKFD5WDGNREPGPGKEHFRI3TPRFDHPPHDGHSPASRE 
RSSSLOGKDMASI.PPKKRPWHDGPGTSEHRCMKAPGGPSEDRGG 
KGRGGPGFmQRVPKSGRSSSLDGEHHDGVHRDEPFGGPPGSGTP 
SRGGRSGSNWGRGSNMNSGPPRRGASRGGGRGR 


6542 


2 


3775


SWPRGRGtTGGHPGALHTRTMOKSVRYNEGHALYLLAFLARKEGT 
KRGrLSKKTAPA5RWHEKWFALyONVLFyFEGEOSCRPAGMYI*L 
KG C S CEK 7 ? A P PRAG AGQGG VR D ALP KQ Y Y F T V LFGK EGQK PI*E 
LRCEEEODGKEWMEAIHQASYAD1LIEREV1.M0KV1HLVQIVET 
EK1AANO1.KH0LED0DTE1ERLKSEIIALNKTKERMRPYQSNQE 
DEDPD1 KK I KKVQSFMRGWLCRRKWKTJ VODYICSPHAESMRKR 
NO^VFTMVCAESEYVHQLYILWGFLRPLRKiAASSKKPPISHDD 
VS S I FLN F E T I M FLH E 1 FHCG LK A R 3 AN W PT1 • I L>ALL FD I LLPM 
LN I YQEFVKNHQ YSLQ VLANCKQNR DFD K LL KQYEAN PACEGRM 
LETFLTY Pt'.FQI PRY I lTLHELLAHTPHEHVERKSLEFAKSKL.fi 
ELSRVMHDFVSDTF.N1 RKNLA3 ERMI VEGCD I LLDTSQTFIRQG 
SLIOVFSVrRGKiSKVRLGSLSLKKEGKROCFLFTKHFLICTRS 
SGG KLHLLKrGGVLSLIDCTMEEPDASDDDSKGSGOVFGHLDF 
KIWKPPDRAAFTWLUVPSRQEKAAWMSDI SQCVDNI RCNGLM 
T1VFEENF KV1VPHMI KSDARLHKDDTD1CFSKTLNSCKVPQIR 
YASVERLLER LTDLRFLS IDFLNTFLHTYKI FTTAAWLGKLSD 
I YKRP FT5 J PV RSLEL FFATSCNNRGEKLVDG KS PR LCR K FSS P 
PPLAVSRTSSPVRAJIKLSLTSPbNSKTGALDLTTSSSPTTTTQS 
PAASPPPK7GQIPLDLSRGLSSPEQSPGTVEENVDNPRVDLCNX 
LKRS I OKA Vl.ESAPADRAGVESS PAADTTELS PCRSPSTPRHLR 

yrqpggotaj;nahcsvspasafaiataaaghgsppgfn t ntertc 
dkjefiirrt^-i'nrvlnvlrhwvskkaqdfeltwelkmnvlnlle 
e vlrdpdl l pq e rkaa?in i lmals qddqdd 1 hbkledl iqmtdc 

MKAECr ESLSAMFLAEQi TLLDHVI FRSI Pit.hr lA>UuwriAljlJfv 
tfERTP Y I M KTSQKFNDMS NLVASQ I MWYADVS SRANA I EK WVAV 
AD I CR CLKK YNGVX£3 TSALNR SA3YR LKXTWAKVS KQTKALMD 
KIjQKTVSSEGRFKNLRETLK^JCNFPAVPYIjGMY LTDLAF3EEGT 
PN FTE EGL Vft FS KMRMISHIIREIRQ FOOTS Y R I DHC P KVAQ YL* 
LDKDLI IDEUTLYELSLKIEPRLFA 


6543 


185"/ 


950 


FVSGCGRAG 3 G LSWAMAAEARVSR WY FGG LAS CG AACCTH P L»DL» 
LKVHLOTCOEVK LRMTGMALRWRTDG 1 LALYSGLSASLCRQMT 
YSLTR FAI YET VRDRVAKGSOGPLPFHEXVLLGS VSGLAGGFVG 
TPADLVlWRNQNDVKLPQGQRRWAHALDGI.YRVAREEGLRRLF 
SGATMASSRGALVTVGOLSCYDOAKOLVLSTGYLSDNIFTHFVA 
SFIAGGCATFLCQPLDVLXTRLMNSKGEYQGVFHCAVETAKLGP 
LAFYKGLVPAG I RmIPHTVLTFVFLEOLRKMFGI KVPF 


6544


630 

1 
1 

1 


79 


pspcfirsr3,ixj0?w^.gleawls0nfslhcp0srvrvrjrasis 
epsdtdpeprtlnpspagwfvqohpblelmssfrerfgrnwlqy 
rshlepsgnplpatpttsapsappassqgpdtaprpsppceear 
gpqespokmseevraepceeeeekegkeekeegemaplpeahlg ^ 

egkqkecp 


6545 


17b 


560 


F PHSHAALL PAAMTPLLTLI h WLMGLPLAOALDCHVCAYNGDN 
CFNPMRCPAJWAYCMTTRTYYTPTRKKVS KS CVPRCFETVYDGY 
S KHASTTS C CO YDLCNGTG LAT PATLALAP I LLATL WGLL 


6546 


1657 



364 


HLL^lJDEVAAFFVADliGAIVRiOjFCFLKCLPRVRPFYAVKCNS 
SPGVLKVLAOIX3LGFSCANKAEMELVQHIGI PASK3 1 CANPCXQ 
3AOIKYAAKHGIOLLSFDNEKELAKWKSHPSAKMVLCIATDDS 
HSLSCLSLKFGVSLKSCRHLLENAKKHHVEWGVSFH3GSGCPD 
P0AYA0SIADARLVFEMGTEIXH)O^HVLDLGGGFPGTEGAKVRF 
EEI ASV 3 K S ALDLY FP SGCGVD I FAELGRY Y VTSAFTVAVSI I A 
KKEVLLDOPGREEENGSTSKTIVYHLDEGVYGIFNSVLFDNICP 








TPILOKXPSTBOPLYSSSLWGPAVJCJCDCVAivtLKbPOLHVGDW 
LVFDNKGA Y TVGMGSP FWGTQACH I TYAMS R VA WEALR R 01- MAA 
ECEDDVKGVCKPLSCGWEITDTLCVGPVFTPAriK 


6547 


j 


541 


LHSKYLAPALCSOPGMMRCCRRRCCCKOPPHALKPLLL.LPLVLb 
PPLAAAAAGPNRCDTI YQGFAECL I RLGDS MGR GGE L ET I CKS W 
KDFHAC ASQVLSGCPEEAAAVWESLCOEAROAK R PNNLHTLCGA 
PVHVRERGTGSBTNQETbRATAPALPKAPAPPI LAAA1*A1 AY Lb 
RPLA 


6548 




219 


FV S R LSVRDVRf PTFLGGHGADAMHTDPDY SAAYVP1 ETDAEDG 
1 KGCGI TFTLGKGTEVGELK1 LSRFQNA 


6549


73 


1490 


ETGRVCEDARPACGSRSRRRRKEAAPGIPTPSPSSSSPTSSRPA 
ARAFSKAPARLSRPRAREEPPDPGRRYIQEEI 3 CARKHKL1 KMC 
SSVAAKLWFbTDRRIREDYPOKEILRALKAKGCTEELDFRAWM 
DEWLT1 EOGNLGLRINGELITAYPQVWVRVFI PWVQSDSDIT 
VLRHL,EKMGCRbMNRPOAILNC\ r NKFWTFOELAGHGVPLPDTFS 
YGGH2N FA KM I DEAEVLE FPMWKNTRGF. RG KA V FLARDKHHLA 
DLSHLIRHEAPYLFOKYVKESHGRDVRVIWGG^WGTKLRCST 
DGRMOSNCSLGGVGMMCSLSEOG}(OIt^I0VSN;] J GMDVCGIDl t L 
MKDDGS FCVCEANANVGPI AFDKACNLiDVAG 1 3 ADYAASLbPSG 
RLTRRM^LLSWSTASETSEPELGPPASTAVBWKSASSSSVDSD 
PESTERELLTKLPGGLFNWWLLANEIKLLVL 


6550


2?.? 3 


922 


FR V S R DG AP DCG 1 EQMG LAMEHGG S Y ARAGG S ? R GCW Y Y L R Y F F 
LFVSLICFLllLGLVLFMVTGNVHVSTFSKLOATERRAEGLYSQ 
LLGbTASOSNLTKELNFTTRAKDAIWMWLKARKDLDRlNASFR 
OC0GDRVJYTNNORYMAAIILSEKOCRDQFKDMNKSCDALLFML 
NOKVKTI.EVEIAKEKTICTKPKESVbbNKRVAEEOLVECVKTRE 
bOHgEROLAKEOLOKVQAbCLPLDKDKFEMDLRpXWRDS 1 1 ?RS 
L DNXG YNLYH PLGSELAS I RRACDHM PS LMSS K VEEI AR S LRAD 
IERVARFNSDLOROKLEAOOGLRASOFAKQKVEKEAOT^EAKLQ 
AECSROTGLALEEKAVbRKERDNUOCELEEKKRLAEOLRMELAT 
R NS A LDT C I KTKSQPMMPV S RPKG P V PN POP I DI : AS LEE FXRXX 
LESQRPPAGI PVAPSSG 




157 


746 


ICPPDPRj^KTLAAYKEKMKELPLVS1iFCSCFLAI ; PLNKSSYKYE 

aetvdlnwcv j sdmetvielnkctsgqsfevilk} psfdgvpefn 
aslprrrdpsleeic>kkleaa£;errkyoeaell):hlaekkeher 

EVJOKAI EENNNFI KmKEKLAOKMESNKXNREAHLAAMLERLQ 
EKTKHAEEVRKNKEL.KEEASR 


6552 


157 


748 


IQFPDPRNKTbAAYKEKWKELPLVSLFCSCFLAX PLNK5SYKYE 
ADTVDLN WCV I SDMEVIBLNKCTSGOSFEVILKf PSFDGVPEFN 
ASlPRRRDPSbEEIOKKLEAAEERRKYOEAELLKKLAEKREHER 
EV10KA1 EENNNFI KmKEKLAOKMESNXENREAHI>AAMLERlrQ 
EKDKHAEEVRXNKEbKEEASR 


6553 


2 


1807 


FVWS KI^HLSYGRVNLNVLREAVRRELREFLDKCAGSKAl VWD 
PYI.TCPF^T.IAOY^I.I .KFHEVEKMFTI.KGNRLP/ ADVKNI 1 FFV 
RPRLELMDIIAENVLSEDRRGPTRDFHILFVPRRSLLCEQRLKD 
LGVLGSFIHREEYSLDLIPFDGDbLSMESEGAFKECYLEGDQTS 
L YHAA KGLMTLQALYGTI PQI FGKGE CARQVANMM 1 RMKR EFTG 
SONSI FPVFDNLLLLDRNVDLLTPLAT0LTYEGL1 DEI YG IONS 
YVKLPPEXFAPKKCX5DGGKDLPTEAKKLQLNSAEKLYAEIRDKN 
r.NAVGSVLSKKAKI I SAAFEERHNAKTVGEr KOFVSOLPHMQAA 
R G SLANH TS 1 AELI KDVTTS ED FFDKLTVEOE FKS G I DTDKVNN 
Y I EDCI AQKriSLIKVLRLVCLOSVCNSGLKQKVL DYYKREI LOT 
YG YEH 2 LTLHNLEKAGLLKPOTGGRNNY PTr R K7 LRLWMDDVNE 
ONPTDISYVYSGYAPLSVRLAQLLSRPGWRSIEEVLRIbPGPHF 
EEROPLPTGLOKKROPGENRVTLlFTLGGVTFAKlAALRFbSQL 
EDGGTE Y V I ATTKLMNGTSW 3 EALMEKPF 



PCT/US00/34263





6554 


11? 


1244 


fi.mgsovsvesgalhvvivgggfgglaaasoloalnvpfmlvdn 
jo)s ft:>^vaalras ve7g fakktf i s ysvtfkdnfrqglvvgi d 
l k^qmvllqgg ealp fs h l» i latg $ tg p fpg kfnevs sqq aai q 
ayedmvrqvqr^rfiwvgggsagvemaaeixteypekevtlih 
sovalatkellpsvkoevkeillrkgvolllservsnleelpln 
e v rey i kvqtdkgtevatntlv i lctg i ki nssayrkafesrlas 
lgaluvne1iu>veghsnvya1gdcadvrtpkmayiaglhan1av 
an:vnsvkorplqaykpgaltfli>smgr>3Dgvgoisgfyvgrxm 
vrltksrdlfvstswktmrosff 


6555


1552 


496 


1 HKAI.LRK2 NOVLLFLbl VTLCV1 LYKKVl-KGTVPKNDADDESE 
TPEELEEE3 PVVICAAAGRMGATMAAINS1YSNTDAN1LFYVVG 
LR NTLTR I R KW I EHSKLRE INFKI VEFNPKG LKGK IRPDSSRPE 
LLOPLNFVRFYLPLLIHQHEKVIYLDDDVIVQGDIQELYDTTIaA 
LGHAAAFSDDCDLPSAODI NRLVGLQNTYMGYLDYRKKAJ KDLG 
JSPSTCSFKPGVIVANMTEWKHORlTKQljEKWrtOXNVEENl/YSS 
SLGGG VATS PML.I VFHGKYST2 NPLWHI RH LGWN PDARYSEHFL 
CL AKLbHWKGRHKPWD HPS V HNDLWESWFV PDPAGI FKLMHHS 


6556 


241 


1449


ASLCKGCFFVTHVLV11I.PSL0SPPTFGFLLD1DGVLVRGHRV1 
PAALKAFRRLVNSCGQLRVPWFVTNAGNILQIISKAOEIjSAU^ 
CFVrADOVILSHSPMKLFCEYIJKKRMLVSGQGPVMENAQGLGFR 
NWTVDELRMAFPbLDKVDLERRLKTTPbPRNDFPRIEGVLLLG 
EFVRWETSLQ11 MDVLLSNG5 PGAGLATPPYPHLPVIASNMDLb 
WMAEAKMPR FGHGTFLLCLETI YQKVTGKELRYEGI.MGK PS I LT 
YCYAEDL1RRQAERRGWAAPIRKLYAVGDNPMSDVYGANLFHQY 
LCF^THDGAPELGAGGTRQOQPSASOSCISILVCTGVYNPRNPO 
S7 K PVLGGGEPF FHGHRDLCFS PGLMEASH WNDVNEAVOLVKR 
KEGWALE 


6557


2598 


1534 


RMCGRTSCULPRDVuTRACAYODRRGQQRIjPEWRDPDKYCPSYN 
KSPOSNSPVLrLSRLHFEKDADSSERIIAPMRWGLVPSWFKESDP 
SKLOENTTNCRSDTVMEKRSFKVPLGKGRRCVVIiADGFYKWC'RC 
OGTN0ROPY Fl YFPOI KTEKSGS J GAADSPENWEKVWDNI-7RLLT 
rlAG IFDCWEPFEGGDVLYSYTI itvdsckglsdihhrmpaildg 
EEAVSKW1.DFGEVST0EALKLIKPTENITFHAVSSWKNSRWNT 
?ECI^PVDLWKKELRASGSSORMLQWl»ATKSPKKEDSKTPOKE 
ESDVP0WSSOFLOKSPLPTKRGTAGLLE0WLKREKEEEPVAKRP 
YSQ 


6558 


23 


1138


rHGRRRGGRKMELGSCLEGGREAAEEEGEPEVKKRRLLCVEFAS 
VASCDAAVAOCFUAENDWEMERALNSYFEPPVEESALERRPETI 
SEPKTYVDLTNEETTDSTTSK I S PS EDTQQENGSMFS LI TWN I D 

GI .DLKNIjS F.RARGVCSYLALY s pdvi flqevi'ppy ys ylkkrs s 
JTYEI 1TGHEEGYFTA1MLKKSRVKLKS0E1I PFPSTKMMRNLLC 
WVNVSGNELCLMTSHLESTRGHAAERMNOLKMVLKKMQEAPES 
ATV I FAGDTN LRDREVTRCX3GL PiWI VDVWEFLGKPKHCQYTWD 

T»r\V'MOWT r , TT7\ftrvl D CT\0 T CPD h. ti El F Fi^U T T DUCT .nT Tr*T.TTVT. 

DCGRFPSDHWGLLCNLDIIL 


6559 


3 


364 


GPEL^GLPTRPKKLKANOTPIAMDCCASRSCSVPTGPATTICSS 
DKSCRCX5VCLPSTCPHTVWLLEPTCCDNCPPPCH3PQPCVPTCF 
LLrCSCOPTPGLETLNLTTFTOPCCEPCLPRGC 


6560 


3 


1435 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQVRDTSSRIAKG 
GV2HTKMSLHGASGGHERSRDRRRSSDRSRDSSKERTESQLTPC 
I RK r ^TSPTROHHVEREKDHSSSR PS SPRPOKAS PNGSI S S AGNS 
S RNS SOS S SDGS CKTAGEWTFVY ENA KEGARN I RTS ER VTLI VD 
NTRFWDPS 1 FTAQPNTMLGRMFGSGREHNFTRPNEKGE YEVAE 
Gl GSTVFRAILDYYXTGI IRCPDGIS1PELREACDYLCISFEYS 
TIKCRDLSAXiWHELSNDGARROFEFYLEEKILPUMVASAQSGER 
ECHIWLTDDDWDWDEEYPPOMGEEYSQJIYSTKLYRFFKYIE



WO 01/53312

SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








NRDVAKSVLKERGZ.KKIRLGIEGYFTVKEKVKKRPGGRPBVIYN 
Y VQR P F I R MS WE K KEG KSR HVD FQCVKS KS I TNLAAAAADI pQD 
OLWMH PTPQVDELD I LP1HP PSGNS DLDPDAQN PML 


6561 




1086 


PGRKFKKKESSSSKWFPADCLl^GLRGPASSLLSPEPSPSWPSHS 
PCPMAALTDLSFMYRWFKNCNLVGKLSEKYVFITGCDSGPGNLL 
A KOL VDRGMQ VLAAC FTEEGSQKLQK D7SYRL0TTLLDVTKSES 
IKAAAOWVJRDKVGEOGLWALVNNAGVGLPSGPKEWLTKDDFVKV 
I WVNLVGL2 2VTLHMb?MVKRARGRVVNMSSSGGRVAV3 GGGYC 
VSKFGVEAFSDSI RREbYYFGVKVCI 1 EPGNYRTAILGKENLES 
RMRKLWERLPQETKDSYGEDYFRIYTOKLKNIMOVAKPRVRDVI 

HCMCU» 1UCBCOD 7 V VMD/~*T niilfT IVTD7 & It T .DTD VT^lPTT CD V 

LPRPAUSV 


6562 




1562 


MSTLYDJRAHKACLLRFFASSDSNKALEQRRTLHTPJCLEHJjDRV 
L Y EW FLG KRS EG VPVSG PML I EKA KD r Y EOMQLTE PCV FSGGWL 
WR FKARUG I KKLDAS S EKOSADHQAA EO r CAFFRS LAAEHGLSA 
E2VYNADETGLFWRCLPNPTPEGGAV PGP KQGKDRLTVLMCANA 
TGSHRLKPLAIGKC5GPRAFKGIQKLPVAYKAQGKAWVDKEIFS 
DWFHH2 FVPSVREHFRTIGLPEDSKAVLLLDSSRAUPQEAELVS 
SrJVFTIFLPASVASl.VOPMEOGIRRDFMRiJFlNPPVpLOGPKAR 
YNMN DA 1 FS V ACAWN AV PS HV F R RA W R X LW PS VA FAEG S S S EE E 
LEAECF P VKPHNKS FAH I LELV KEG SSC PGQLRQR OAAS WG VAG 
RFIAEGGRPPAATSFAEWWSSEKTPKADODGRGDFGEGEEVAWE 
OAAVAFEAVl.RFAEROPCFSAOEVGObKAIiRAVFRSOOOVRRRR 
GALGAWKVEALOEGPGGCGATAQSPLPCSSTAGEN 


6563 


2379 


2694 


I^^PAOrVLLREPEGAGPPVPAGHLVK>iLOGG>ILiRERA.HPDLEA 
HEHPLPCDOMFWRQNGGKLRMVEANSRGWWGIGYDHTAWVYTG 
GYGGGCFOGbASSTSNIYTQSDVKCVHIYENQRWNPVTGYTSRG 
LPTDR YW WS DASGLCECTKAGTKPPS LOWAVrVS DWPVD FS VPGG 
7DOEGWQYASDFPASYHGSKTKKDFVRRKCWARKCKLVTSGPWL 

fcVrrlALnUVoI 1 rhor'bAtoouno I /UjWAV^I'WjI-' VuLKbo 

FLN PAGSSW LKVGTDOP FAS I S I G ACYC VWAVARDGS AF Y RGSV 
YPSQPAGDCWYHIPSPPRORLKOVSAGOTSVYALDENGNLWYRO 
GITPSYPOGSSWEHVSNNVCRVSVGPLWVWVIANKVOGSHSLS 
RGTVCHRTGVQPHEPKGiiGWDYGl GGGWDHI SVRANATRAPRSS 
SQEQEPSAPPEAHGPVCC 


6564 




5-75 


APGSCALWSYCGRGWSRAMRGCQLLGLRSSWPGDLLSARLUSQE 
KK AAETHFG PETVSEEEKGGKVYQVFES VAKKYDVMNDMMS LG I j 
HRVWKDIAjLW KMHPLPGT01.LDVAGGTGDIAFRFLNYVOSOHQR ! 
KOKRQLP^QCNLSWEEIAKEYONEEDSLGGSRVWCDINKEMLK j 
VGK0KA1A0GYRAGLAWVLGDAEELPFDDDKFDI YTIAFGI RNV \ 
THI DOALOEAHRVLKPGGRFUTLEFSOVNNPLl SRLYDLYSFQV 
1 FVLGEVI AGDWKSYQYLVESI RRFPSQEEFKDMI EDAGFHKVT 
YES LTSG I VAIHSGFKL 


6565 


1464 


995 


KSAVANGLTKRRMGLKLNGR YI SLI LAVOI AYLVQAVRAAGKCD 
AVFKGFSDCLLKLGDSMANYPOGLDDKTNIKTVCTYWEDFHSCT 
VTALTDCQEGAKDMWDKLRKESKNLNIOGSLFELCGSGNGAAGS 
LLPAFP VLLVSLSAA1ATWLS F 


6566 


3 


1385 


KYESAOPGGTOPEPGLGARMAIHKALVT/.CLCLPLFLFPGAWAQG 
H VP PGCS CG LNPLY YNLCDRSG AWG I VLEAVAGAGI VTTFVLTI 
I LVASLP FV QDT KKR SLLGTQV FFLLGT LG LFCLV FACVEK P D F 
STCASRR FLFGVLFA I CFSCLAAHVFALNFLARKNHGPRGVTVI F 
TVALLLTLVE V IJNT5WLI I TbVRGSG EGGPCGNS SAGKAVAS P 
CAIANKDFVMALIYVWLLLIjGA^ 

LLTTATS VAI WWW 1 VMYTYGN KQHNS PTWDD PTIA I ALAANAtf 
AFVLFYVIPEVSOVTKSSPEQSYCGDMYPTRGVGYETILKEQKG 
QSM FVEN KAFSMDEPVAAKRPVS P YSGYNGQLI/rS VYQPTEMAL






PCT/US00/34263



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence



6567 



125 



Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence



6569 



205 



6570 



330 



169 



6572 



49 



1 52; 



Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)



MHKVPSEGAVDI I LPRATANSQWGSANSTLRAEDttYSAQSHCA 
ATPPKDGKNSQVFRNPYVYJD 



T KRSN LKAY ACS I HK 3 R7MSYV F V N DSSQTNV P LLQAC I DGDFN~ 
YSKRLLESGFDPMI RDSRGRTGLHLAAARGNVP1 CQLLHKFGAD 
LLATDYQGWTALHLCGHVDTIQFLVSNGI.K1DICNH0GATPLVL 
AKRRGVNKDVIRLLESLEEQEVKGKNRGTKSKLETMQTAESKSA 
MESHSLLNPML0OGEGVLSSFRTTW0EFVEDLGFWRVLLLIFV1 
ALLSLGIAYYVSGVl.PFVENOPELVH 



HASVRLLVLPWYSHFS0*SANLQGPSR7TELFHPTLASISSPjr 
LEGAEl.YFNVDHGyj.EGLVRGCKASLLTQCDVINLVCCETLEDL 
KIHLOTTDYGNFLANIHTNPLTVSK1DTEMRKRLCGEFEYFRKHS 
LE PLS TFLTY WTCS YK1 DNV 1 LbMNGALQKKSV KS1 LGKCHPLG 
RFTEMEAVNIAETPSDI.FNAILIETPI^APFFODCMSEKALDELN 
3 EL1>RNKLYKSY: j EAFYKFCKNHGDVTAEVMCP1 LEFEADRRAF 
1 1TLNSFGTELSKSDRETLY PTFGKLYPEGLRLIoAOAEDFDQMK 
NVADHYGVYKPLFEAVGGSGGKTLEDVFYEREVOM-WLAFNROF 
HYGVFYAYVKLKE0EJRN3 VWIAECIS0RHRTKIN3YIPIL 



Wrgporlghgrptpllcri^tagpshwekoarafogTrpvdpr 
rkswlfpltksasssaagspggltsloookcrlleslrnshs^l 
ae1qkdveyrlpftinkl7ininjllpp0fpoekpv3svyppir 
hhlmdkq3vy vts plwnftmhsdlgk i iqslldefwknppvba 
ptstafpylysnpsgmspyasogfffbppyppqeanrsltslsv 

ADTVSSSTTSHTTAKPAAPSFGVLSNLPLP1 PTVDAS1 PTSQNG 
FGYKMPDVPDAFPEl>SELSVSgLTDMNEOEEVLLEQFLTLPOLK 
QIITD KODLVKSI EELARKNLLl iEPS LE AKRQT VLDKYELLTQM 
KSTFEKKMOROHELSESCSASALOARLKVAAHEAEEESDNIAED 
FLEG KM E I DC FLSS F»^EKR T I CHCR RAKEE KL00A I AMHSQ FHA 
PL 



ARiPRLTFLREGFL.YVUjSHWVFVGAPRPPASDSWKKGLVPSAP 
PASRKMGSKALPAPI PLHPS LOL3TNYS FLOAVNTFPATVDHbQG 
LYGLSAVOTt^MflHWTLGYPNVHEITRSTlTEMAAAOGI .VDARF 
PFPALPFTTHL.FHPK03A1 AHVL.PAI.H KDRPR FDFANLAVAATQ 
EDPPKMGDbSKLSPGLGSPI SGLSKLTPDRKPSRGRIjPSKTKKE 
FICKFCGRHFTKSYNLL3HERTHTDERPYTCD1CHKAFRFODHL. 
RDHRY 1 KS KEKPFKC0ECGKGFCCSRTLAVHKTLHMQTS5PTAA 

SSAAKCSGETVICGGT 

APDMNRKKL0KLTDTI.TKKCKHLFRGFDKDNDGCVNVLEW1HGL 
SLFLRGSLEEKMKYCrEVFDLNGDGFISKEEKFHMl.KNSLLKQP 
SEEDPDEGI KDLVE I TLKKMDHDHDGKI^ FADYELAVREETLLL 
EAFGPCLPDPKSQMEFEAQVFKDPNEFNDM 



TP ERAQ PGALLGAAGCCVCGGR KWPRS HERG YF£ SAKMGSKRR1C 
I^CSERHQKLWBrryCKKLHVOALKNVNSOlRNOMVONENDNRV 
ORKQFLRLUjmsOFEliDMEEAlOXAEENKRbKElXJLKOEEKLAM 
ElAKLKHESbKDEKMRQOVRENS I ELREbEKKLKAAYMMKERAA 
QIAEKDA2 KYEQMKRDAEI AKTMMEEHKRI I KEENAAEDKRNKA 
KAQYYLDLEKQLEEQEKKKQEAY EQLL.KEKLMI DE I VRKI YEED 
Ot.EKQQKUEKMNAMRRYIEEFQKEQALWRKKKREEMEEENRKi: 
EFAhl^QQQREEDRMAKVOE^EEKRLOLONAXTOiaEEMLRORED 
LEQVR0ELYOEEQAEIYXSKLKEEAEK1CLRKOKEMKODFEE0MA 
LKEbVLQAAXEEEENFR KTMLAKFAEDDR I ELMNAQKQRMKQLE 
HRRAVEKLIESRROOFLADKORELEEWOLOORROGFINAIIEEE 
RLKbLKEHATNLLGYLPKGVFKKEDDIDbLGEEFRKVYCXJRSEJ 
CEEK 



6573 



"76T 



27S 



GGGGGESQSF^OCGTRTPATlXrLMYWPRKLMlWGYDMVOK 
LFLDFFK R R L£ QR PT AEE LEQRU I LKPRNEOEEQEEKREI KRRL 
TRXLSQR PTVEELRERXI U R FSDYVEVADAQDYDRRADKP WTR


SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








LTAADKVSRGECWKVGGRTVCWVSLGSPLGSV 


6574 



204 


1159 


LESSVPVSVGVFWACGVSWTGAAGLQDGALSDTMARiJAEKAMTA 
Uu^FRQAQLEEGKVKERRPFLASECTELPKAEKWRRQIIGEJSK 
KVAOI0NAGLGEFRIRDLNDEINKLLREKGHWEVKIKELGGPDY 
GKVGPKMLDHEGKEVPGN T RGYKY?GAAKDLPGVRELFEKEPLPP 
PRKTRAXLMKA1DFEYYGYLDEDDGV1VPLEQEYEKKLRAELVE 
XWKAEREARLARCBKEEEEEEEEEIN3YAVTEEESDEEGSQEKG 
GDDSQOKFIAHVpVPSOOEIEEAbVRRKKMELLQXYASETLQAO ' 
SSEARRLLGY ] 


6575 


117 


62C 


S PAUASQSGG I TEE WLLPQENGV I DLPDY EH VEDET FPPFPPp j 
ASPER0DGEGTEPDEESGNGAPVPVPPKRTVKRN1PKLDAQRL1 1 
SERGLPAI>RHVFCKAXFKGKGHEAEDLKMLI RHMEHWAHRLFPK ! 
LQFEDF1 DRVEYLGSKKEVQTCIjKRIRLDLPILHEDFVSMNDEV ' 

aennehd vtsteldp fhtnls e s em fas els i s lte e0oor i er 
nxqlalerrqaku 


6576 


1 


1060 


p e pqalvgq kkgalr 1 ,lv arlv ltvs apae vr r rvlr pvlswmd 
retraladshfrglgvdvpgvgoapgrvafvsepgafsyadfvr 
gfllpnlpcvfssaftogwgskp.rwtpagrpdfdhllrtygdv 
wpvancgv0eyksmpke}?mtlrdyitywkey10agyssprgcl 
ylkdwulcrdfpvedvitlpvyfssdwi.nefwdaiidvbdyrfvy 
agpagswspfkad1 frs fswsvnvcgrkkwi>lfppgcf.ealrdr 

HGNLPYDVTSPALCrjTHl.HPRMOTjAGPPuEITQEAGEMVFVPSG 
WHHQVKNLVMCCFSCPLSGAFLOEDGSTTSPJ.yQPELGWNGVAH 
G 


6577 


22 71 




SDRr^ASDEFDlVIEAMLEAPYKKEEDEOORKEVKKDYPSNTTSS 
TSNSGNETSGSST1GETSNRSRDRDRYRRRKSRSRSPGR0CRHR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF 
REKSPVREPVDNLSPEERDARTVFCMOIiAARIRPRDLEDFFSAV 
GKVRDVRI1SDRMSRRSKGIAYVEFCSI0SVPLAIGLTGORLLG 
VPI 1 V0A^0AE}OmAA^lANNL0><GNGGPMRLYVGSL < H?^^I TED 
MLRGI FEPFGKI DN1VX»MKDS07GRSKGYGFITFSDSECARRAL 
EQLNGFELJ^GRPKiRVGKVTERlXGGTDITFPDGDQELDLGSAGG 
RFQLMAKLAEGAGIQLPSTAAAAAAAAAAQAAAIiQLNGAVPLGA 
LNPAALTALS PALWLASOCLQLSS LFTPQTM 


6578 


377 


148<> 


PSSSATMMRAPL KRAT1 LHMALTGASDPS AEAEANGE K P FLLRA 
LQIAJjWSLYWVTSl SKV FLNKYLbDSPSLRLDTPl FVTFYQCL 
VTTLUCKGLSALAACCPGAVDFPSLRLDLRVARSVLPLSVVFIG 
MITFNNItCLKYVGVAFYNVGRSLTTVFNVLLSYLLLKOTTSFYA 
LLTCGI 1 1 GG F V?LG VT)OEG AEGTLS WLG TVFGVLAS LCVS LNA 1 
YTTKVLPAVDGS I WRLTFYNNVNAC1LFLPLLLLLGEL0ALRDF 
AQ1>GS AH FWGMMT1>GGLFGFA I GYVTGLQ1 KFTS PLTHNVSGTA 
KACAQTVUVVLYYESTKSFLWWTSN^VLGGSSAYIWRGWEMK 
KTPEBPSPKDSEKSAMGV 


6579


2 




DDDDUUVDFT Cpt CtthHOUWQUDTaPfJtMUPVFT^^IVKQ^JVYT 
KrrKVW X r rjJbKT.L>o>V\ftV'l\>4oni\l Hrv»~I"iYr ir 1 ooo v ix^>o/*x i 

IYMGKDKTENEDLIKHGWPEDIWFHVDKLSSAHVYLRLHKGENI 

EDIPKEVJ^IDCAHLVKANSIQGCKMNTW^^ 

DVGQ IGFH RQKDVKI VTVEKKVNE I LNRLEKTKVERFPDLAAEK 

ECRDREERJ^EKKAOIOEMKKREKEEMKKKREMDELRSYSSLMKV 

ENMSSNQDGNDSDEFK 


6580 


62 


1571 


LVALKNWKPKGTN1PAPOSPVFGEAVSGVYMMTKVLGMAPVLGP 
RPPOEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFROFGYKDTPG 
PREALS0LRVLCCEWLRPE3HTKEQILELLVLEQFLTILP0BL0 
AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNECKPVME 
KISSSGTAKESPSSMOPOPLETSHKYESWGPliYIOESGEEOEFA 
0DPRKVRDCRLSTOHEESADEOKGSEAEGLKGDI3 SV3 IANKPE 
ASLEROCVMliENEKGTKPPLQEAGSKKGRESVPTKPTPGERRYl 






SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








CAECGKAFSNSSNLT^RRTHTGEKPYVCTKCGKAFSHSSNLTL 
HVRTHLVDRPYDCKCGKAFGQSSDLLK.HORMHTEEAPYOCKDCG . 
KAFSGKGS LI RHYR I HTGEKPYQCNECGKS PS QHAGLSSHQRLH - 
TGEXPYKCKECGKAPhrHSSNFNKHirR2HTGEKPywO:HCGKTFC 
SKSNLSKHQRVHTGEGSAP 


6581 


22b ! 476 
1 


RVFLKDLSSTPMASNNTA51 AOARKLVEOLKMEANJDR 1 KVSKA 
AADLMAY CEAHAKEDPLLTPVPASENPFREKKFFCA 3 I- 


6582


1428 


718 


CFTTKTKCS PVSVP YhS PLVLR KELES LLENEGDCV 1 K7SS F I K 
OHPIIFV\'TLVWYFRRLDLPSNL?GLILTSEHCNEGV01'PI-S£LS 
QDS)CLVY10LLWDNlNLHOEPRE?LYVCViRNFNSEKK£SbLSEE 
QOETSTLV ET 1 R13S 1 OHNNVLKP I NLLS OOMKPGMKROR S LYRF. 
ILFLSLVSLGRENIDaEAFDNEYGIAYNSLSSEILEKLQKIDAF 
PSASVEWCRKCFGAPLI 


6583 


48"/ 


41 


RIFSMTSGRLRWRCTWRPATAI.WSASI.RLGTSSMHFi'PRSlSLP 
LSMMI.SPLPSNTRGLSPTALFRSPDSEHAT.SCPRLKI.WRCRAPL 

13 CTJCT3T /"OI fWTX DO C DT vnTUTlTMC^ VP\ TH1 nl/HDC C CfZTC Vh f~ 
Korb r'JjLrKijU V Ltr nor Lit* vn J liW£»v>r^r. v 1A?J-»VJ vyKo K.ik>llj r/iV- 

S0AGSGAVQGGNWC3 F 


6584 


189 


1750 


PLPMAALGPSSQWTEYVVRVpKNTTKKyNlMAFNAADKVNFAT 
WNOARLERPLSNKKlYOEEEKPESGAG^tFNRKLREEARRKKYG 
I VLKEFRPEDOPWLLR VNGKSGRKFKGI KKGGVTENTS Y YI FTQ 
CPDGAFEAFPVHNWYNFTPLARHRTbTAEEAEEEWERRN^LNH 
FS1MO0RRLKDQD0DEDEEEXEKRGRRKASELRIHD1 .KDDL.EMS 
SDASDASGEEGGRVPKAKKKAPLAKGGRKKKKKKGSr.DEAFEDS 
DDGDFEGOEVDYMSDGSSSSCEEPESKAKAPOOEEGPKGVDEQS 
DSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRiOSSEEF.DSSE 
ESDIDSEA£SAFFMAKKKTPPKRERKPSGGSSRGNSI< PGTPSAE 
GGSTSSTLRAAA S KLEQGXJtVSEMPAAKRI .R LUTGPCS I * G KST 
POPPSGKTTPNSGDV0VTEDAVRRYLTRKFMTTKDLLKKFQTKK 

J oLiOij Ly 1 v|i v urtyi XjJVKLmi x CO Al ii I'uixj'inr »j u»c< 


6585 


3 


16 7* 


GPJRNSRI DDFVGGDPRAEASC5 VLHSK KRAMADS R D P A S DQMQ 
HWKEORAAOKADV) .TTGAGNPVGDKLNVITVGPRGPLLVCDVVF j 
TDEMAHFDRBRI PER WHAKGAGAFGY FEVTHDI TK YS KAKV EE < 
HIGKKTPIAVRFSTVAGESGSADTVRDPRGFAVKFYTEDGrWDL - 
VGNNTP 1 FF I RD P I LFPS F I USQKRN PQTHLKDPDM VWD FW S LR 
PE5LHQVSFLFSDRG3PDGHRHMNGYGSHTFKLVNANGEAVYCK 
FHYKTDQGI KNLSVEDAARLSQEDPDYGI RJDLFNAIATGKYPSW 
TF^IQVMTFKOAETFPF^IPFDLTKVWPHKDYPLIPVGKLVLNRN 
PVWYFAEVEOIAFDPSKMPPG1EASPDKMLCGRLFAYPDTHRHR 
LGPKYLH I PVNCP Y R AR VAN YQRDG PMCMQDfvQGGA PNY YPNSF 
GAPEOOPSALEHS I OYSGEV.^RFNTANDDNVTQVRAFYVNVL.NE 
EORKRLCENI AGHLKDAQ1 F10XKAVKNFTEVHPDYGSH3 OALL 
DKYNAEKPKNA3 HTFVQSGSHLAAREKANL 


6586 


32 


804 


PLPE0PA3STSTMPVSGTPAPNKKRKSSKLIMELTGGGOESSGL 
HU5KKI SVPRDVMLEELSLLTNRGS KMFKLROMRVE KF I Y ENH P 
DV FS DS S MDHFQ K FL PT VGGQLGTAGQG FS YS KSNGRGGS OAGG 
SGSAGQYGSDOOHHLGSGSGAGGTGGPAGCAGRGGAAGTAGVGE 
TGSGDQ AGGEGKH I TVFKT YI S PWERAMG VDPOQKMELG 3 DLLA 
YGAKAELPKYKSFNRTAMPYGGYEKASKRMTFQMPKV 


6587 


75 


1117 


KRVPSLGXMPECWDGEHD2ETPYGLI>HWIRGSPKGNR PAl L>TY 
HDVGLNHKLrCFNTF FNFEDMOE I TKHF WCHVDAPGOOVGASQF 
PQGYQFPSMEQLAAMLPSWQHFGFKYV3 GI GVGAGAYVLAKFA 
LIFPDLVEGLVLVKIDPNGKGWIDWAA7KLSGLTSTLPD1VLSH 
LFSQEELVNOTELVOSYRQO^GNVVNOANLObFWNMYKSRRDL.P j 
1NRPGTVPNA1<TLRCPV>u^WGDNAPAEDGWEC13SKIjDPTTTT 
FLKJW>SGGbP0VTQ?GKLTEJ\FKYFLOGKGYWPSASKTRLARS 
RTASLTSASSVDGSRPQACTHSESSEGLCgVNHTMEVSC 






SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


6588


13"/ 


5 01 


LG LC A C'LL E jRTNN Y Q1>S DE LR KNG V E LTSLRQKV A Y LDK E F S K 
AOK^.LSKSKKAOEVEVLLSENEMLOAJOjHSQEEDFRLOMSTL-MA 
E FS X LCSQMEOLEOENOQLKEGAAGAG VAQAG F 


6589 


* 


140S 


R PWG b" AMATFS R0EFFOOLLQGCLLPTAQOGLDC1 WLLLA1 CLA 
CR1 VaT* LG L ?S Y LKHAS TVAGGFPS LY HFFQLHKVW WLLSLLC 
YLVL FL CR HSS HRGVFLS VTI LI YLLMGEMHMVD 7VTWHKKRGA 
OM J V AMKA VS LG FDLDR GE VG T VPS P VEFMG YI..YFVGTIVFGPW 
1 S FHS YLQAVOGRPLSCRWLQKVARSLALALLCLVLSTCVGPYL 
FPYFI PLNGDR LLRNKKR KARGTMVR W I .RAYESAVS FH KSNYFV 
GFLrEATATLAGAGFTEEKDHLEWDLTVSKPLNVELPRSMVEW 
TS WN L?KS Y KLNNYVF itNALRLGTFS AVLVT Y AAF ALLHG FS FH 
LAAVLLSLAFITYVEHVLRKRLARILSACVLSKRCPPDCSHOHR 
LGLC »/RALNLLrGAliAI r HLAYLGSLr DVDVuuT 1 hbywi oMAY 
TVHKK'SELSWASHWVTFGCKIFYRLIG 


6590 


2171 


6S6 


VRA Y r;HVLS LLENV FTPMFCHRDE Y FRQLLRGAES PTRNS KLNR 
GSL5." >DDFRNTQXRGESFG1SR1GSKT KGVFXSTTMEGAMLPNY 
GVAEGEDnFIEEGIWMEDDSPVEAVSTPNTPRNLAAWIClSIPY 
VDFFKDPSS ERKKKKERIPVFCIDVERNDRRA VGHEPEHWSVYR 
RYLKFYVLE SKL.TEFHGAFPDAQLPS KRI 3 G PKNYE FLKS KR EE 
FOEYLOKLLQHPELSNSQLLADFI^PNGGETOFLDKILPDVNLG 
KI 3K£ VPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSLN T NKKLFNDLFKNNANRAENTERK0N0NYFMEW1TVEGVY 
DYLMY VGR WFQVPDWLHHLLNGTR I LFKNTL£MYTDYYLOCKL 
EQLFOEHFLVSLITLLRDAIFCENTEPRSL0DKOKGAKOTFEEM 
MNY" PDLLVKCI GEETKYESI RLLFDGLQOP VLNKQLTY VLLD3 
V3QELFPELNKVQKEVTSVTSWM 


6591 


2177 


6S6 


VRA Y F. H VLS LLENVFTPMFOJRDEY FRQLLRGAES FTRNS KLNR 
GSLSLDDFRNTQKRGESFGISRJGSKIKGVFKSTTMEGAM-jPNY 
GVAKGEDDF3 EEGI WMEDDSPVEAVSTPNTPRNIiAAWK I S I PY 
VD F FF D PS S E R KE KKER I P VFC I D VERNDR RAVGHE P E H WS V YR 
RYLEFYVLESXVTEFHGAFPDAQLPSKR11GPKNYEFLKSKREE 
FCEYLOKH OHPELSNSQLLADFLSPNGGETQFLDKILPDVNLG 
Kl 1 KSVP5KLMKEKG0HLEPF1MNFINSCESPKPKFSRPELTIL 
5PTS F K K LFN DLFKNN ANRAENTER Y FMEVKT VEGVY 
DYLKYVGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLOCKL 
EOLFC EHRLVSLITLLRDAIFCENTEPRSLODKOKGAKOTFEEM 
MNYH DLLVKC1 GEETKYESIRLLFDGLOQPVLNKQLTYVLLDI 
V1QELFPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSM1DANLKL.LCEAE0RLKAIVAEKFA1ATKEG 
it ,PPV~RFFKlrPT,ICl.'HFFf;L"RKFSEYLCKOVAS*<AEENLLMV 
LGTUK S DRRAAV I FADTLTLLFEG JAR I VETHOP 1 VET Y YG PGR 
LYTLI X YLOVECDRQVEKWDKF J KQRDYHOOFRHVONNLMRNS 
TTEK1 E PRE LDP J LTEVT1>MNARSELYLRFLXKR I SSDFEVGDS 
KASEE VKOEH0KCLDKLLNNCLLSCTM0ELI GLYVTMEE Y FMR E 
TVNKA VALDTYEKGOIiTSSMVDDVFYI VKKC1 GRALSSS S IDCL 
CAM 1 N LATTE LES D FRDVLCN KLRMG F P ATT FQE> I OR GVT S A VH 
WHS* LOOGKFDTKGIESTDEAXMS FLVTLNNVEVCSENI STLK 
KTLE5 DCTKLFSQGI GGEOAQAKFDS CLSDLAAVS W K FRDLLQE 
GLTELNSTAI XPQVQPW1NSFFSVSHNI EEEE FN DY EAND P WVQ 
QFI LNLEOQMAEFKASLS PVI YDSLTGLMTSLVAVELEKVVLKS 
TFN^LGGLOFTKELRSLIAYLT^VTTWTIRDKFARLSQfnATILN 
LERVTE I LD WGPNSGPLTWRLTPArVRQVLAl»R3 DFRSED1 KR 
LRL 


6593 


3 


1837 


EAFS?.GSRRRGLALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR 
RGGE^GGGRGRGDKRRRJ^QARROR^RPEPAEARGGKMADVLSVL 

rqyn: ckkei WKGDEVI fge fswpkwktnyvvwgtgkegqpr _ 






SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








EYV'TLDSILFLLNN^niLSHPVYVRRAATENIPWRRPDRKDLLG 
YLNGEAST£AS1DRSAPLE1GLORSTOVKRAAEE\OAEAKKPR1 
EDKECVRLDKERLAARLEGHKEG3 VQTEQI RSLSEAMS VEK 1AA 
I KAK I MAKKRST I KTDLDDD1 TALKORS FVDAEVDVTRDI VSRE 
RVWRTRTTILQSTGKflFSKNJ FA I LOSVXAREEGRAPEQRPAPN 
AA P VD PTLR TKQ P 3 P AA YNRY DQE R FKG KEETEG FK 1 DTMGTYH 
GMTLKSVTEGASARKTOTPAAOPVPRPVSOARPFPNOKKGSRTP 
3 I 3 IPAATTSMTMLNAKDLLODLKFVPSDEKKKOGCQRENETL 
3 ORRKDQMOPGGTAI SVTVPYRWDOPLKLMPODWDRWAVFVQ 
GFAWQFKGWPWLLPDGSPVD3 FAK3KAFHLXYDEVRLDPNVQXW 
DVTVLELSYHKRHl.DRPVFLRWETbDRYMVKHKSHLRF 


6594 


1 


109t 


E F PGRR F RGS OAS P LCATCG P ALLRAF TRAAMTR S L F KGNF W S A ] 

T\7 1 CTiaYTiW T TOHT NUrPKNrKFFFnFt.KFR AATFFRYGKD1»I, 

NLSRKKPCGOSEINTLKRALEVFKQQVDHVAQCHIQLAQSLREE 

ARKMEEFREKQICLQRKKTELIMDAIHKOKSIiOFKKTMDAKKNYE 

0 K CR DK D E AE QA VS R S ANL VN PKQQEKLFVK LATSKT AVEDSD K 

AYMLH1GTLDKVREE17QSEH3KACEAFEAOECERINFFRNALWL 

H\^QLS0OCVTSDEMYE0VRKSLEMCS3ORniEYFVNORKTGCI 

PPAPIMYENFYSSOXNAVPACKATGPNI^RRGPLPIPKSSPDCP 

NYSLVDDYSLLYQ 


6595 


57 


781 


¥ IAj 1 rli> L S U 1A> i U c.<j 1> Li> l«Ao a K K. K :< \j>NUt'lsJ^^vi\. ± lj-.UWXjJi.fi 
RYNAYPSE0EK1^LSG0TNUSVL03CNWF1NARRR1,LPDMLRKD 
GKDPNQFTlSRRGGKASDVAiPRGSSPSVLAVSVPAPTNVbSLS 
VCSMPLKSG0GEKPAAPFPRGE1.ES PKPbVTPGSTbTbbTRAEA 
GSPTGGLFNTPPPTFPEQDKEDFS5F0LLVEVAI.QRAAEMEL0K 
OQDPS LPLLHTP I PLVS ENPO 


6596


2 


1Q2K 


PRLPVRRYhGHR^bOGRSRGHMAEGDAGSDORQNEElEAMAAlY 
GEEWCVIDDCAKIFC3RISDD3DDPKWTLCLOVMLPNEYPGTAP 
r-rvoi Mz*t>WT.KrT>FPAnT QN<SI,FETYTONIGES JLYLWVEK1RD 
VL10KS0MTEPGPDVKKKTEEEDVECSDD13LAC0PESSVKALD 
F^3SETRTEV£VEELPF3DHG3P1TDRRSTF0AHLAPVVCPK0V 
XMVLS KLYENK KI ASATHM I YAYR 3 Y CEDKOT FliQDCEDLXSETA 
AGGRLLHLMEI LKVKNV>tVWSRWYGG I LUG FDR FKHINNCARN 
I LVE KN YTNS P EESSKAI G KJJK K VR K D K KRNEK 


6597


c 


1026 


PRLPVRJ*YT^GRRR1jC^ESRGHMAEGDAGSDQR0N^ 
GEEWCV I DDCAKI FC3 R1SDD3 DDPKWTLCL<?VWL?NEYPGTAP 
P3 YQLNAPWLKGQERADLSNSLEE 3 Y 1 ONJGES J LYLWVEKI RD 
VL30KSC»MTEPGPDV7(KKTEEEDVECEDDIilI>ACOPESSVKALD 
FDI S ETRTE^'E VEBIjPP 3DHGI P3 TDR RS TFOAHbAJPVVCPKOV 
KMVLS KLYENKK1 ASATKNI YAYR3 YCEDKQTFLODCEDDGETA 
AGGRbLHl^E3Lr^raW^^m75RWyGGILbGPDRF)QlI^JNCAR^} 
I bVEKN YTNSPEESSKALGKNKKVRKDKKRNEK 


6598 


1099 


41S 


PRVRWATTMAKjSFEMPWOYRFPPFFTLOPNVDTROKObAAWCSL 
VLS FCRLHKOS SMT VMEAQESPLFNNV KLQRKLPVES IQ1VLEE 
bRKKGNLEWLDKSKSSFLlMWRRPEEWGKLIYQWVSRSGCNNSV 
FTL Y E LTNGEDTEDE E FttGL DEATLLRALQAI.QQEH KAE 1 1 TVS 
IXJPRROVLI^GTCbPLbbTSHI^RAFKRROTOCPPKTGSVTPPD 
SKGLQS 


6599 


164 


159} 


KMAALTTLFKY 3 DENC/DRYI KKbAXKVAI OS VSAWPEKRGEI RR j 
MMEVAAAJDVKOLGGSVELVDIGXOKLPDGSBI PLPPI LLGRLGS 
DPOKKTVCiyGHLDVOPAALEDGWDSEPFTLVERLXSKLHGRGST 
DDKGPVAGWINALEAYOKTGQElPVNVRFCliEGKEESGSEGbDE 
L1FARKDTF FKDVDYVCISDNYWLGKKKPClTYGbRGlCYFPlE 

VE CSNKDLHSGVYGGS VHEAMTDbl LLMGSLVDKRGN 1 LI 

EAVAAVTF.EEHKLYDD3DFDIEEFAXDVGAQILLKSHKKDILMH 

RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE 






SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








VVGEQVTSYLTKKFAELRSPNEFKVYMGKGGKPWVSDFSHPHYL 
AG R RAMKT VFG VE PDLTRE GGS I P V T LT F QE ATG KNVML L PVG S 
ADDGAHSQNEKLNRYNYIEGTKMLAAYLYBVSQLKD 


6600 




934 


PGRliFRVAAMESAGLEQLLRELLLPDTERIRRATEOLQIVLRAP 
AA LS ALCD L LA S AAD PQ I R OFAA V LTRR R LNTR WRR LAAEQR E S 
LKSLIbTALQRETEHCVSLSLAQLSATIFRKEGLEAWPQLLOLL 
0HSTHSPHSPEREW5LLLLSVWTSRPEAFQPHHRELLRLLNET 
LGEVGSPGLLFYSLRTLTTMAPYuSTEDVPIARMLVpKLIMAMQ 
TTtT PinFAKAC^AJ.FM.nFIiljE^FV^VTTPYL^EVLTFClAWAR 
NVALGN A I R I R I LCCLTFLVKVKS KALLKNR LLATLAAHPFPHC 
GC 


6601 




1420 ~ 


PRAA ARAP P PAVLRR DRRAATAPGAG t MTLHGPLAORY F LNH IE 
K1TTWQDPRKAMNQPLNHMNLHPAVSSTPVP0RSMAV5QPNLVK 

R LQR I OMERER I RMRQEELMROEAALCROLPMEAETLAPVOAAV 
NPPTMTPDMRSITNNSSDPFLNGGPYHSREQSTDSGL£LGCYSV 
PTTPEDFLSNVDEMDTGENAGO'rPMNlNPOOTRFPDFLDCLPGT 
NVDLGTLESEDLIPLFNDVESALNKSEPFLTWL 


6602


127 


617 


LLD F PALP K FVLAQS PKAGK PSTMTS MTQSLRE V 1 KAKTKARNF 

TT* r~ \ / I Ol/ T T»T 17f> Ti A T"V~» ls\f T PCMVlIPPniTM M(?Tt . T T'JT! IfT 

r.H VJLA>K.l 1 .L»Vi>A/U'oKV 1 vfcrJJwr < r..fc-li IrJft-LOl JjMv>o.Li 1>S 1 LiVLI 

N I S TMALLCTE RGAPGVS VDMN1 T Y KS PAKLGED 1 VI T AHVLKQ 
GKTLAFTSVDLTNKATGKLIAQGRIITKHIjGK 


6603 


7S 


660 


PVGP SSLAARTGLGHLPFLHRLAS S RGbDMDLLQ FLAFI >F VU,L 
SGMGATGTLRTS LDPS LE I Y KKMFE V KRREOLLALXNLAOLKDI 
HGOY K I LDVMLKGLFKVLEDSRTVLTAADVLPDG P?POH)EKLKD 
AFS H W ENTAF FGDWLRF PR I VHYY FDHNSNWN LLIRWG ISFC 

Mr > Tr*VPMfV~'DUCt3TT CT.M 


6604 




68c 


TSTAQROGGERMS FRGGGRGGFNRGGGGGGFNRGGS SNKFRGGG 
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERWLLGEFL 
HPCEDDI VCKCTTDENKVP Y FNAPVYLENKEQ1 G KVDE1 FGQLR 
DFYFSVKLSENMKASSFKKLOKFYIDPYKLLPLO^FLPRPPGEK 
GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605


7 


B46 


SGS R RGAMRAAG VGLVDCHCHLGAPD KDRDLDDVLE KAK KANW 
ALVAVAEHSGEFEKIMOLSERYNGFA'LPCLGVHPVQGLPPEDOR 
SVTLKDLDVALPl IENYKDRLLAIGEVGLDFSPR FAGTGEQKEE 
QRQ VL I RQ I OLA KRLMLP VNVH S RS AG R PTI N LLO EQG AE KVLL 
HAF DGR PS VAMEGVRAGY FFS IPPS1 X R SGQOK I .VKQLP LTS I C 
LETDSPALGPEKQVRNEFWNISISABY1A0VKGISVEEV1EVTT 
QNALKLFPKIRHLLQK 


6606 


2 


1682 


FVEIRPRAEVANLSAHSASPIODAVbKRLSLLEDIVYROLNGLS 
KS LGLI EG YGGRG KGGLPATLS ? AE E E KAKG PHEKYG YNS YLS E 
KI SLDRS I PDYRPTKCKELKYSKDLP01 S 1 1 FI FVNEALSV1LR 
SVHSAVNHTPTHLLKEIILVDDNSDEEELKVPLEEYVHKRYPGL 
VXWRNQKR EGL1RAR I EGWKVATGOVTGFFDAHVEFTAGWAEP 
VLSR 1 QENRKRV3 LPS I DNI KQDNFEVQRYENSAHG Ys WELWCM 
YISPPKDWWDAGDPSLPIRTPAMIGCSFWNRKFFGEIGLLDPG 
MDVYGGENIELG3KVWLCGGSMEVLFCSRVAHIERKKKPYNSNI 
GFYTKRNALRVAEW7MDDYKSH VYI AWNLPLENPG 1 D I GDVSER 
RALRKSLKCKNFOWYLDHVYPEMRRYKNTVAYGELRKNKAKDVC 
LD0GPLENHTA1 LY PCHGWGPOLAR Y T KEGFLHLG ALGTTTLiLP 
DTRCLVDNS KSRLP0LLOCDKVKSS L Y KRWN FI 0NG Al MNKGTG 
RCLEVENRGLAGIDLILRSCTGQRWT1 KNSl K 


6607 


137 


386 


VPACAGLkKEARSLLAS PPRLLNTKLJQASCRALFS P P IQSROTT 
G I SFCGRGGAGPGVPTRTQVFAAMGAVMGTFS S LOTKORRPSKD 
K I EDELEMTMVCHR PEGLEQLEAQTNFTKRELG;VLYRG FKNECP


SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








SG WN EDT F K0 1 Y AQF F PHGDA ST Y AH Y LF N A F DTTQTG S V K FE 
DF VTALS I h LRGT VH E XLR WT FN L YD I NK DG Y I NQEEMMD I VKA 
IYDMMGKYTYPVLKEDTPR0KVDVFFQKMDKNKDG1VTLDEFLE 
SCQEDDMMRS^QLFQNVK 


6608 


22 4 


1140 


RPCFSSPTGIiCPRLSYPMILLCHAVLPPPXQPSPSPPMSVATRS 
TGTLQLPPQKPFGOEASLPLAGEEELSKGGEODCALEELCKPIjY 
CXLCNVTLN S AOOACAHYOGXNKGXXLRJNY Y AANS CPP PARMSN 

wepaatfwpvppcmgsfkpggrvilatendycklcdasfssp 
avaoah yog knha krlrlaeacsns fs fsselgqrrar kegnef 
kmmpnrrhmytvonnsgpy fnprsrqr 1 prdlamcvtfsgqfyc 
smcnvgageeme frqhlesxohxs xvs eqryrnemenxgy v 


6609 




4 4? 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEORLSALTLLSWSAVT 
PAAEPGNFOl^PAEPRGPUASPVRAAPRAPCPAAEMSELiNTKTS 
PATNQAAGQEE KG KAGNVKXAEE EEE I D I DLTA P ETEKAALA1 Q 
GKFRRFOKRKKDPSS 


6610 


3l£ 




GKXSLCNLH3 F] RFPLTYPDMYKGMMCTAKXCGIRFOPPAI I LI 
YESEI KGKIRQRI KPVRNFSKFSDCTRAAEQLKNNPRHKSYLEQ 
VSLROLEKLFSr:,nGYL$GOGLAETME03 0RE*rrlDPEEDLNKL 

CGWDTESADEF 




976 


2i: 


rcCSGAGSRVWWl.PALRHLAMGSTESSEGRRVSFG'/DEEERVRV 
IXX? VR LSENWNK M KE PSS F ?P APTS S TFGLQDGN LRAPH KEST 
LPRSGSSGGQQPSGMKEGVKRYEQEHAA10DKLF0VAKREREAA 
TKHS XAS L PTG EG S I S H E EQ KS VR LAR E L ES KEAELR RRD T FY K 
EOLFRIERKNAEMYKiSSEOFHEAASKI'lESriXPRRVEPVCSGL 
QACILHCYRDRPHEVLLCSDbVKAYQRCVSAAHKG 


6612


1724 


992 


VSTHASAl^RTOGOPOROPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAORVDDSPSTSGGSSDGDORESVQOEPEREgVOPKKKEGKl 
SSKTAAKLSTSAK R 1 QKELAEI TJbDPPPNCSAGPXGDNI YEWRS 
TILGPPGSVYEGGVFFLDITFSPDYPFXPPKVTFRTRIYHCNIN 
SOGV 1 CL-D1 LXDK ws PALT2 S XVLLS 1 CSLLTDCH PADPLVGS I 
ATOYMTNRAEHDRMAROWTXRYAT 


6613


130 


74 fc 


ELELSSNMPEQSKI)YRVAVFGAGGVGKSSLVLRFVKGTFRESYI 
PTVEDTYRQVISCDKSICTLOITDTTGSH0FPAMORUSISKGHA 
FILVYSITSRQSLEELiKPI YEOICEIKGDVESIP1MLVGNKCDE 
SPSREVQSSEAEALAR'J-WXCAFMETSAKLNHNVXELFQELLNLE 
KRRTVSLOIDGXKSXC»KRXEKLXGXCVI M 


6614



3 


119: 


SSAAEAMRVTjVRRCWGPPI*AHGARRGRPSPQWRALARLGWEDCF 
DSRVREK? PWRVLFrGTDQ FAR EALRALHAARENX EEELI DXLE 
WTMPSPSPKGLPVXOYAVQS0LPVYEWPPVGSGEYDVGWASF 
GRLLNEALI LKFPYG I LNVH PS CLPR WRGPAPV J HTVLHGDTVT 
GVTIMQIRPKRFDVGPILKOETVPVPPKSTAKELEAVLSRLGAN 
ML»I S VLXNLPESLSNGRQQPMEG ATYAPK J SAGTSC1 XWEEQTS 
EQ I FR LYKA I GN 1 1 PIiQTLWMANTl KLLDLVEVNS S VLADPKI>T 
GQALI PGS VI YHK0SQ I LLVYCKDGW I G VRS VMLXXSLTATDFY 
NGYLHPVTY0KNS0A0PSOCRF0TLRLPTXXKQKKTVAM0OC1E 


6615 


832 


35 


GRVG AG AS AMSELPGDVRAF LR EHPSLRLOTDAR X VRC I L7GHE 
LPCRLPELOVYTRGKKYORLVRASPAFDYAEFEPHIVPSrKNPH 
QLFCKLTLRHINKCPEHVLRHTOGRRY0R?vLCKYEECOKOGVEY 
VPACLVHRRRRREDQMDGDGPRPREAFWEPTS5DEGGAASDDSM 
TDLYPPELFTRKDLGSTEDGDGTDDFLTDXEDEKAXPPREKATD 
EGRRETTVYRGLVOXRGKKOLGSLKKKFKSHHRKPXSFSSCXOS 
G 


6616 


347 


168* 


LLPPCQGARPLSEPPHASEDNLFLFWNCILCAFPKPSPOPLOYP 
VWPLLLVIT0IPAPRHLRWRPFSFSRGGLDSFSGSLSTPS7CRS


SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








PAWVKMAPWPPKGLVPAVLWGLSI»n,NI,PGPIWLQPSPPPQSSP 
PPQPHPCKTCRGLVDSFNKGLERTIRDNPGGGNTAWEEENLSKY 
KDSETRLVEV/LEGVCSKSDFECH^LLELSEELVESWWFHKOQEA 
PDLFQWLCSDSLKLCCPACTFGPSCLPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCOAGYGGhACGOCGLGYrEAERNASHLVCSACF 
GPCARCSGP£ESNCLOCKKGWALHHLKCVDIDECGTEGANCGAP 
OFCVNTEGSYECRDCAKACLGCMGAGPGRCKKCSPGYOOVGSKC 
LDVDECETEVCFGENK0CENTEGGYRC1CAEGYK0MEG1CVKEO 
I PESAGFFSEf^TEDELWLOQMFFGI I 3 CALATLAAKGDLVFTA 
I FIGAVAAKTGYWLSERSDRVLEGFIKGR 




118 


671^ 


VWMAWQVSLLELEDRLQCF 1 CLEVFKESLMLQCGHSYCKGCLVS 
LS YHLDTKVR CPMCWQAVDGS S SLPNVS UWVI EALRLPGDPEP 
K VCVHHRN PLS Ij FCEKDQE L I CGLCGLLG S HQHH P VTP ISTVCS 
RMKEELAALrSELKOEOKKVDELIAKLVKNRTRIDGSAPSLCPC 
LGPATFTFL 




548 


:3fc 


DG KVARRAPNS PAFQND1 Y PLVS APRATTAES PWS KVLQNTQCR 
NVPKMTSERSRIPCLSAAA.^GTGKKOOEGRAMATLDRKVPSPE 
AFLGKPWSSWIDAAKLHCSDNVDLEEAGKEGGKSREVMRliNKEA 
WKYGT 


6619


246 


842 


PASSEVLTAAVMFLLLNC1 VAVSQNMG 1 GKNGDLPRPPLRNEFR 
YFORMTTTSSVEGKONLVJMGRKTWFSIPbKNRPLKURlNLVLS 
R E J. KEPPOG AH FL^SLDDALKLTERPELANKVDMl VJ I VGGSSV 
YKEAMNHLGH LKLFVTR 1 MQDFESDTFFS EI DLEKYKLLPEYPG 
I LSDVQEGKK 3 KYKFEVCEKDD 


6620


3 


167S 


NS R VDDFVAXARKAAENEASOESALGAYS P VDYMS I TS FPRLPE 
DEPAPAAPIjRGRKDEDAFIjGDPDTDPDSFLKSARLQRLPSSSSE 
MGSODGSPLR ETRKDPFS AAAAECSCRODGLTVI VTACLTFATG 
VTVALVMQ IYFGDPQ3 FQQGAWTDAAR CTS LGIEVLS KQGSSV 
DAAVAAALCLG3VAPHSSGLGGGGVMLVHDIRRNESHL1DFRES 
APGALREETLORSWETKPGLLVGVPGMVXGLHEAHOLYGRLPWS 
OVLAFAAA VAQDG FJWrHDl^AflAJLAECl . P PNMS ERFRETFLPSG 

rpplpgsllkrpdi^aevldvlgtsgpaafyaggnltlemvaeao 
haggviteedrsnysalvekpvcgvyrghlvlspppfhtgpali 
salni leg fni.tslvs reoalhwvaetlk 1 alalasrlgdpvyd 
sti tesmddmi /skveaaylrghindscaapapllpvyeldgapt 
aaovblmgpddfivamvsslnqpfgsglltpsgillnsomldfs 
wpn rtanhs aps lens vqpgkr pls fllptwr paegi>cgtyla 
uja^gaarglsgltqvrftpwlaffsrepscgldcrclsylwlv 
siphaanmg 


6621


i 


662 


VQG 1 TSYQQR LiO ALR KE KS R DAAR S RRG KEN FE FY E LAKLLPL P 
AAI TSQLDKASI 1RLTI SYLKMRDFANOGDPPWNLRMEGPPPNT 
SV KV I GAQRRR S PS ALA 1 EV FE AH bGSH 1 LOSLDGYVFALNCEG 
KFLY I SETVS I YLGLSQVELTGSS VFDYVH PGDHVEMAEOLGMK 
LPPGRGLLSUGTAEIXJASSAbSSSOSETPEPVVCFPPAil^FLL 


6622 




33S 


GRASGAQEETEAGGPERARAMEANMPKRKEPGRSLRIKVISMGN 
AEVGKSC3 1 KR YCEKRFVSKYLAT1G1DYGVTXVHVRPREIKVK 
IFDMAGHPFFYEVRKPF 


6623 


1866 


189 


KALFEKVKKFRLHVEEGDILYAMYVRQTVLKVI KFLI J I AYNSA 
LVSKVQFTVI)CWDIODMTGYKNFSCNWTMAHLFSKLSFCYLCF 
VSIYGLTCLYTLYWLFYRSLREYSFEYVR0ETGFDD2PDVKNDF 
AFMLHMIDQYDPLYSKRFAVFLSEVSENKLKQLNLNKEWTPDKL 
roklotnahnrlelpl: MLSGLPDTVFE1TELQSLKLEI 1 KNVM 
3 pati aqldnlqelslhqcs vk i hs aai s flkenlkvls vxfdd 
mrelppwmyglrnleelylvgslshdjsrnvtleslrdlkslki 
1^1 kswskipoavvpvss^okmcihndgtklvwlr^lkkmtn 

LTELELVHCDLERIPKAVFSLLSLOELDLKENNLKS1EE1VSFQ 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 




1 

1 



HLKK LTVL K LKKKS 1 T Y I PEH I KKLTSLERLS FJT KIEVLPSH : 
LFLCNKlRYLDLSYNDIRFlPPEIGVLOSLQYFSriCNKVESLP 1 

delyfckklktlkjgk>islsvlspkignllflsyi,l.g:kgnhfei i 

LPPELGDCPvALKRAGLWEDALFETLPSDVREOMKTE 


6624 


236 


1766 


GS R 3GGGS R 1 PAVS TH VAPGRS VLRPFASGALR1 F f LVKALOGC 
RGRPSGLAHLSOETSHWRAKRSGRACLGDFPGEn.RSFIMKCTA 
REWbRVTTVLFMARAIPAMVVPNATbLEKLLEKYKrEDGEWWIA 
KOKGKR AI TDNDMOS 3 LDLHNKLRSQVYPTASNME > HTWDVELE 
RSAESKAESCVA-HHGPASLLPSlGQNLGAHMGRYKhPTFJlVOSM 
YDEVKnFSYPYEHF.CNPYCPFRCSGPVCTHYTOV\ r VATSNRlGC 
AINLCHKKN1 WGC3 WPKAVYLVCNYSPKGNWWGilAPYKKGRPCS 
AC P ?£ FGGGCRENLCYKEGSDRYYPPREEETNE 1 EUQQSQVHDT 
HVRTRE EDS S RNEV 3 £ AOQMSQ I VSCEVRLRDQCK07TCNRYEC 
PAGCbUSKAKVIGSVHYEMQSSaCRAAIHYGIIDNIGGWVDITR 
OGR KH Y FI KSNRNG I QT1 GK YQSANSFTVSKVT VCA VTCETTVE 
0 LtC P F H K PAS HC P R V Y C PR KLYAS KS TLCS CKWNS S L F 


6625 


1124 




PGPkGGGGSLl^TKALGRSRGLGMHPGPSSGGTEGCVPTALRPP 
GPLVPCTSDDNLLKNIELFDKliALRFHGRLLFLKDVLGDElCCW 
S FYGOGRK 3 AEVCCTS I VYATEXKQTKVEFPEAP 3 F3- ETLN3 L J 
YETPRGPDPALLEATGGAAGAGGAGRGEDEENREKFVRRIHVRR 
H3THDERPHGQQ3VFKD 


6626 



3 


i4 9e 


SAVE?VYTDRFHblLG:SVEFbCSLRSDATMES37ACLHAU)Ab 
LDV P W p R S K i G SDQDSG 1 E LLN V LH R V 3 LTR ESPS3 CLAS LEW 
ROT 1 CAAOEHVKEKRRSAEVDDGAAEKETLPEFGECKJDTGGLVP 
GKSLVFATLELCVCI LVRQLPELNPKLTGSPGVKA3 KPQ1 LLED 
GSRLVSAALVILSEL PA VCSPEGS IS ILPTILYbTJCVLRETAV 
KLPGGQbSSTVAA£LOALKG3LSSPMARAEKSRTAKTDLbRSAb 
TT 3 L DC WI) P VD E 3'HQELDE VS LLTA I TVFI LS TS F FA' TT I PCLQ 
KRCIDKFKATLEIKDPWOIKTYObbHSIPOYPNFAVSYPYIYS 
LAS C 3 ME KLOE 3DKRK PENTAELE 3 FQEG 3 KVLE'J LVTVAEE HH 
RAOLVACLLP 3 LISF LLDENSLGS ATS 3 KRNLHDF A1,QNLM0 J G 
PQYSSVFKSLVASSPALKARbFAA3KGNCESVKVK3 PTSKYTKS 
PGKKSS3QLKTSKL 


6627 


1 


697 


GIPHLSSKDMTGTPGAVATRlX^EAPERSPPCSPSYbLTGKVMbL 
GDTGVGKTCFLI0FKDGAFLSGTF1ATVG3DFRNK\^TVDGVRV 
KLQ I WDTAGQE R FRS VTHA Y YRDAQALLLL YD I TN K5.' S FDN 3 RA 
WLTE3HEYAORDW3MLLGNKADMSSERV3RSEDGrrbAREYGV 
PFLETSAKTGMNVELAFLA3 AKELK YRAGHQADE PS P03 RD YVE 
SQXKRSSCCSFM 


6628 


1 


1861 


QCAE FGGGSGGGGGSGGGGSGGGRGAGGEENKEN E R i- SAGS KAN 
KEPGDS L SLE3 1X)3 3 KES0OOHGLRHGDFORYRG YC£ RROR RLR 
KTLNP KMGNRH KFTGKKVTEELLTDNR YLLLVLKDALRAWS YAM 
OLKOEANTEPRKJiFKLLSRLRKAVKJIAEELERXCESNRVPAKTK 
LEAQAYTAYLSGMLRFEHOEWKAAlEAFNKCKTlYEKbASAFTE 
EOAVbY^QRVEEISPN3RYCAYN3GDOSA3NEl,MOS^bRSGGTE 
GLUAEKbEAL3 TOTRAKQAATMSEVEWRGRTVPVKI I K VR3 FLL 
GLADN EAAIVQAESEETKERLFESMLS ECRDA3 Q WR EELKPDQ 
KQRDY 3 LEGE PGKVSNLQYLHS YLTYI KLSTA3 KRKEiCMAKGLQ 
RALLQOCPEDDSKRSPRPQDLIRLYDIILQNLVELLCLPGLSED 
KAFQKE3GLKTLVFKAYRCFFIA0SYVLVKKWSEALVLYDRVLK 
YANEVNSDAGAFKNSLKDLPDVG-EL3 TQVRSEKCS LOAAAI LDA 
NDAH0TETSSSOVKDNKPLVEKFETFCLDPSLVTKQA.VLVHFPP 
GFQP1 PCKPLFFDLALNKVAFPPLEDKLEC3KTKSGLVGY 3XGIF 
GFRS 


6629 


5653 


4549 


GATF LG S VGGR TG KMDAATLT YDTLR FAEFEDFP ETSE P VW 3 LG 
RKYS 3 FTEK0E I LS DV ASRLWFTYRKNFPA3 GGTG PTSDTGWG C 








KLRCG0M3KA0ALVCR}ILGRDMRWT0RkF0PDSYFSVLNArIDR 
KDSYYSIK0IA0MGVGEGKSTGOWYGPNTVAQVLKKLAVFDTWC 

SLAVHIAttDNTVVMEKJ RRLCRTSVPCAGATAFPADSDRHCNGF 
FAGAEVTNRPSPWRPLVLLIPLRLGLTDINEAYVETLKHCFMMP 
QS1XJV1 GGK PN S AHY F 1 G YVGEELl YbDPHTTQPAVEPTDGCFl 
PDES KHC0KPPC3MS 1 AELDPS 3 AVVRGGHLSTQAFG AECCLGM 
TRKTFGFLRFFFSMLG 


6630 


2 


423 


LVQCGGIRRKSAWGAKPGRHVSRVRALYKRVLQLHRVLPPDLKS 
L^PQYVKXJEFRRHKTVGSDEAORFLQKWKVYATALLOC'/iNENRQ . 
NSTGKACFGTFl.PEEKLNDFRDEQJGOLOELKQEATKPNROFSl 
SBSMKPKF 


6631 


? 


423 


LVOCGGIRRRSAWGAMPGRHVSKVRALYXRVLOiuHRVliPPDLKS 
1XWYVKI)EFRRHKTVGSDEA0RFLQEWEWATALLCK)ANENRQ ; 
NSTGKACFGTFLPEEKLNDFRDEQ 3 GQLQELMQEATKFNRQFSI 
SESMKPKF 


6632 


1273 


S68 


WNSRGRTQHGAA?LAPAAAMKAW0RVTRASVTVGGEC'3SAIGR 

Gl CVLLGISLEDTCKELEHM'^RKILNLRVFEDESGiCHWSKSVI'dD j 

KQYEILCVSOFTL0CVLKGNKPDFHLAMPTE0AEGFYTJSFLEQL 

RKTyRPELIKDGKrGAYMCVHIONPGPVTlELESPAPGTATSDp 

KOLSKLEKOOORKEKTRAKGPSESSKERNTFRKED^SASSGAEG 

DVSSEREP 


6633 






ATGRHEGVFTUiGIlOObVNGlITPATIPSlXSPWGVUHSNPMDY 
AWGANGLDAIlTOLLNOFENTGPPPADKEKlQAluPTVFVTEEHV 
GSGLECPVCKDDYALGERVRQLPCNHLFHEGC3VPWLE0HDSCP 
VCFKSLTG0NTATNPPGL7GVSFSSSSSSSSSSSPSNENATSNS 


6634 


1 


1134 


CGG1PRKGSGPRRRLPMJ\KLRDCLPRLMLTLRSLLFWSLVYCYC 
GLCASIHLLKLLWSLGKGPAOTFRRPAREHPPACLSDPSLGTHC 
YVRIKDSGLKFHYVAAGERGKPL^LLLHGFPEFWYSWRYQLREF 
KSEYRWA1DLRGYGETDAPIHR0NYKLDCL1TDIKDILDSLGY 
SKCVLIGHDWGGMIAWLIA1CYPEWVMKLIVINFPHPNVFTEYI 
I.RHPAQLLKSSYYYFFC3PWFPEFMFSINDFKVLKHLFTSHSTG 
IGRKGCOLTTEDLEAYIYVFSQPGALSGPINHYRN1FSCLPLKH 
HMVTl^PTLLl.WGENDAFMEVEMAEVTRrYVKNYFRLTjLSEASH 
WLQQDOPDIVNKLIWTFIjKEETRKKD 


6635 


1420 


470 


ENiRAGQOEASMIjRW TRAW K U>R EG LG PHGPS FARVP V AF S S S SG 
GRGGAEPRPLPLSYRIjbDGEAALPAVVFLHGLFGSKTNFNSIAK 
I LA0OTGRR VLTVDARNHGDS PHS FDMS YE IMSQDLQDLL P01K5 
LVPCWVGHSMGGKTAMLLALURPEbVERLIAVDISPVESTGVS 
HFATy^AAWRUiJWlADELPRSRARKLJU)EQl^SVIOD«MAVRC>HI, 
LT>3LVEVIXRF\A'IRVNIJ1Aj^T0HLEK1LAFPOR0ESYLGPTLFL 
LGGNSQFVHPSKKPE IMRLFPRAO.^QTVPNAGHWI 1IADR PQDFI 
AAIRGFXjV 


6636 


1S14 


1801 


S FCMFSHK0DS KF0AVPVOEKKKRLR RAP WRAFAQPOR EKH PAE 
OPIVRQCIjORPPLCGVIjGPVQQQLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSSPCFHDGTCVLDKAGEYKCACLAGYTGQRCENLL5AGKSK1 
KASEDSLSVLEERNCSDPGGPVNGYOKITGGPGL1NGRI1AXIGT 
WSFFCNNSYVLSGNEKRTCOQNGEWSGKQPICIKACREPKISD 
LVRRRVLPMOVOSRETPLHOLYSAAFSKQKLOSAPTKKPALPFG 
DLPMGYQHLHTOLOYECISFFYRRLGSSRRTCLRTGKWSGRAPS 
CI P1CGKI EN1TAPKTQGLRWPWQAAIYRRTSGVHDGSLHKGAN 
FLVCSGAX.VNERTVVVAAHCVTDLGKVTMIXTADLKVVLGKFYR 
DDDRDEKTI0SL0ISAI ILHPNYDPI LLDADIAI LKLLDKAR J S j 
TR VCP I CLAAS RDLSTS FC ESH I T VAGWNV1>ADVR S PG FKNDTL . 
RSGWSWDSLLCEEQHEDHGIPVSVTDNMr'CASWEPTAPSDIC | 
TAETGGIAAVSFPGRASPEPRWHLNK3LVSWSYDKTCSHR1.STAF i 








TKVLPr KDWI ERNMK 


6638 


3351 




GG 1 POAGGKMAAPWWRAALCECRRWRGFSTSAVLGKRTPPLGPM 
PNSDIDLSNLERLEKTRSFDRYRRRAEQEAQAPHWWRTYREYFG 
E KTD P K E K 1 D 3 G LP P P KVS RTQOLLERKQA I CELRANVE EERAA 
RLRTASVPLDAVRAEWERTCGPYHKQRLAEYYGLYRDLFHGATF 
VPRVPLHVAYAVGEDDLMPVYCGNEVTPTEAAQAPEVTYEAEEG 
SLWTLLLTSLDGHLLEPDAEYLHWLLTNIPGNRVAEGQVTCPYL 

fpfpargsgjhriafllfkqdqpidfsedarpspcyqlaqrtfr 
tkdfyk khcetmtpaglsffqcrwddsvtyi fhqlldmrepvfe 
fvrpppyhpkqxrfphrqplryldryrdsheptygiy 


s 04 6 


12 6 8 


IbLf J. MUiM.rlJiAjN Jul xMUvf v \.t. LjLJC rCK fVK rC^E. Crir^J^VKrvt^c, 

dpeecpeevydprslyerloeokdrkqoeyeeofkfknmvrgld 
edetnfldevsrqoeliekorreeelkelkeyrnnlkkvgisqe 
nkkevekkltvkpietknkfsoakllagavkhkssesgnsvkrl 

KPDPEPDDKN0EPSSCKSLGNTSI»SGPSIHCPSAAVC2GILPGL 
GAYSG^SDSESSSDSEGTINATGKIVSSIFRTNTFLEAP 


6640 


117 


1043 


VLEFPDVSKAESEDRSLRIVLVGKTGSGKSATANT3LGEEIFDS 

n -r n »v X TTlrkT/~>/"W J\ r»f M/*\/"»T^T\1r t \T\tr\TT>T*1 rf^TVCt*! flTTPyr 

R J AAQ A V X KNLUKASKEV^C^kDIjLiVVUi Fv*l»r U l Kr.bJL*U i l LKk 
1SRCI I CSCPGPHAIVLVLLLCRYTEEEOKTVALIKAVFGKSAK 
Kl^MVI bFTKKEELEGOS FHDFI ADADVGLKS 1 VKSCGNRCCAFS 
NS KKTS KAE KESQVQELVE L 1 EKMVQCNEGAY FS DDI Y KDTEER 
LKOR E E VLRK I YTDQLNE E I KLVEED KH KSE E KKE K£ I KXLKLK 
YDKKIKNIREEAERNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 


6641 


1 


894 


SAAVGRRSEVRGCAPRPRLRRSARRMDPVPGTDSAPLAGLAWSS 
ASAPPPRGFSAISCTVEGAPASFGKSFAQKSGYFLCLSSLGSLE 
N PQENWAD1 0 1 WDKSPLPLGFSPV CDPMDS KAS VSKKKRMCV 
KL LP LG ATDTAVFDVRLSG KTKTV PG YLR I G DMGG FAI WCKKAK 

fit- HrvyKrH^ljoKi)rKJ^ljjl^uJ\^ 1 ftoMiwoKMo 
TLRRNDS 3 YEASSLYG3 SAMDGVPFTLHPRFEGKSCSPLAFSAF 
GDL.TI KS LAD IEEE YNYGFWEKTAAARLPPSVS 




22 


129& 


PLEERMMTKMDPNDQAORDIIFELRRIAFDAESDPSNAPGSGTE 
KR KAMYTKDY KMLGFTNH INPAMDFTQTPPGMLALDNMLYLAKV 
HODTY I R I VLENSSREDKHECPFGRS AI ELTKMLCE3 LQVGELP 
NEGRN D Y H PM F FTHDRAFE E LFG I C I QLLN KTW KEMRAT AED FN 
KVMQWREQ1 TRALPSKPNSLDQFKS KLRSLSYSEI LRLRQSER 
MSODD FOS P P I VELREK I QPEI LELI KOORLNR LCEG S S FR KI G 
NR R R QE R F W Y CR LALNHKVLH YGDLDDN PQG EVTFESLQE K I P V 
AD 3 KAI VTGKDCPHMKEKSALKQNKEVLELAFS I LYDPDETLNF 
I APNK Y EY C I W I DGLS ALLGKDMSS ELTKSDLD7LLSMEMKLRL 
LDLENIOI PEAPPPI PKEPSSYDFVYHYG 


6643 


3 04? 


22«5 


SLHAPAEGRTRGRLASKPKMLTRK3 KLWD2 NAH I TCRLCSG YLI 
DATTVTECLHTFCRSCLVKYLEENNTCPTCR3VIHOSHPLQYIG 
HDRTMODIVYKLVPGbOEAEMRKOREFYHKLGKEVPGDIKGETC 
SAKQHLDSHRWGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CLECNSSKLRGLKRKWIRCSAOATVLHLKIO^lAKICiNLSSFNEL 
DI LCNEEI LGKBHTLKFVVVTRWRFKKAPLLLHYRPKWDLL 


6644 


1485 


290 


FRPLATEPRGSSPVQLVSSTMSVRTLPLLFLNLGGEMIiY I LDQR 
LiRAQN I PGDKARKVLNDI I STMFNRKFMEELFKPQSLY SKXALR 
WYERUAHAS IMKLNQASMDKLYDLMTMAFKYQVLLCPRPKDVL 
LVTFNHLDT1 XGF1RDSPTI LOQVDETL»RQLTE I YGGLSAGEFQ 
LI RQTLLI FFODLHI RVSMFLKDKVQNNNGRFVLPVSGPVPWGT 
EVPGLIRMFNNKGESVKRIEFKHGGNYVPAPKEGSrEFYGDRVL 
Kl^TNMYSVNQPVETHVSGSSKmASWTQESIAPNPLAKEELNF 
LARLMGGME I KKPSG PEPGFRLNLFTTDEEEEOAALTR PEELSY 
EVINI OATQDQORSEELARI MGEFE I TEOPRLSTS KGDDLLAMM 
DEL 


6645 


( 53 0 


4646 


FVEGI^GyVYKAASEGKVJjTLAALLl>lRSESDIRYLl>GYVSQQG 
G0RSTPL1 Z AAR^HAKVVRLLLEHYRVOTOOTGTVRFDGYVID 
GATALWCAAGAGHFEWKbLVSHGANVNHTTVTNSTPLRAACFD 
GRLDIVKY LVENNAN I S I AN K Y DNTCLM ] AAY KGHTD WR YLLE 
ORADPNAKAHCGATALHFAAEAGH I DI VKELI KWRAAI WNGHG 
NT PL KVAAES CKADWELLLSKADCDR RS RI E ALELLGAS FAND 
RPKYDIIXTYHYLYLAMLFRFODGDNILEKEVLPPIHAYGNRO-E 
CKNPQELESIRQDRDALHMRGL1 VRER1LGADNIDVSHPI 1YRG 
AVYADNMEFEQCIKLWLHALHLRQXGNRNTHXDLLRFAQVFSQM 
IKLNETVKAPDIECVLRCSVLE1EQSMNRVKN I SDADVHNAMDN 
YEOTLYTFLYLVCI STKTOCSEEDOCK I NXQI YNL1 HLDPRTRE 
G FTLLHIiA VNSNTPVDDFHTNDVC55 FPNALVTKLLLDCGAEWA 
VDNEGNSALHIIVQYNRPISDFLTLHSIIISLVEAGAHTDMTNK 
0NKTPLDKSTTGVSE3 LLKTCMKMSLKCLAARAVRAKD INYQDQ 
1 PRTLEEFVGFH 


6646 


17( 


890 


PSSRMWLPEDMENALTGSOSSHASLRNlHSIWPTOLMARIESy 
EGREKKG3 SDVRRTFCLFVTFDLLFVTLLWI I ELNVtfGGIENTL. 
EKEVMOYDYYSSYFDI FLLAVFRFKVLJ LAYAVCRLRHWWAIAL 

TTIi VTQZXPT 1.MH7TT.QV1 F^P/ZIiPnYVT ,PT T Q^TTAUT FTWFT.D 
J ] Mv 1 onf Jb L>/\ /vv A i->o xV Li r oyvMru I vijriia« 1 ij^irv i ij. i n c LiLs 

FKVLPQKAEEENRLL1V0DASERAAL1PGGLSDG0FYSPPESEA 
GSEEAEEKODSEKPLLEL 


6647 


IK 


890 


pccRMI^HLPEDKENALTGSQSSHASIiRNlHSINPTOLMARIESY 

PfIP7*KKr;T QnVBPTPH ^"VTFnT.I.WrT.MiT T Ft>N\TNG(5T RNTT. 
L.\jr\Lj\i\<.7 j o u v /v f\ i rv. jjr v i r uijLir v i uun i * luih v i^w? a c*i» a i_» 

EKEVMQYDYYSSYFDI FLLAVFRFKVL I LAYAVCRLRHWWAIAL. 

TTAVTSAFLLAKVI LS KLFSQGAFG YVLF 3 1 S Fl LAW1 ETWFLD 

FKVLPQEAEEENKLL1 VQDASERAAL3 PGGLSDGQFYS PPESEA 

GSEEAEEKQDS EKPLLEL 


6648 


4i:- 


897 


RNCWNCFTKYFNSPPEDIDHKDSYLITRSIMAEPDYIEDDNPEL 
1 RPQKLI NPVKTSRNHQDLHRELLMNOKRGZjAPONKPEIjOKVME 
KRKRDQVI K0KESEAQKKKSDLE1 ELLKR00KLEQLELEKQKLO 
EEQENAPEFVKVKGNLRRTGQEVAQAQES 


6649 


1357 


832 


W3 FRAAG1 RHEVKWDVKEI MSQHN J Y VDALLKEFEQFNRRLNEV 
SXRVR I PLPVSNILWEHCI RLANRT I VEGYANVKXCSKEGRALM 
OLDFQ0FXMKLEKLTD2 RP I PDKEFVETY I KAY YLTENDMERWI 
KFHP E YSTKQLTNLVNVCLGSH 3 NKKAKOKLLAA I DD I DR PKR 


6650 


^ ' • 


765: 


LVPLVFSLLVQSCKQVYRS I AMKFVPCLLLVTLSCLGTLGQAPR 
OKCX3STGEEFHFQTGGRDSCTMRPSSLGOGAGEVWLRVDCRNTD 
OIYW CEYRG0PSMCOAFAADPKSYWNOALQELRRLHHAC0GAPV 
LR PS VCR EAG PQAHMQQ VTS S LKGS P E PNQQ P E AGT PSLR P KAT 
VKLTEATOLGKDSMEELGKAKPTTRPTAKPTOPGPRPGGKEEAX 
KKAWEHCWKPFQALCAFLI S FFRG 


6651 


3 4 2f- 


2 3 5? 


AXELLKVGDFSLCAGPYONTALrrMEWLSKEFLASFVSESFDISA 
CGI ATEHVKI DNSGEGLTAEAGSETLS RDGEVGVNSDMHYELSG 
DSDLDL1X5DCRNPRLDLEDSYTLRGSYTRKXDVPTDGYESSLNF 
HNNNOEDWGCSSWVPGKETSLPPGHWTAAVKKEEKCVPPYVQIR 
DLHG I LR'J YANFS I TKELKDTMRTSHGLR RH PS FSANCGLPSSW 
TSTWOVADDLTQNTLDLE YLR FAHKLKQTI KNGDSQHSAS S ANV 
FPKES PTQIS IGAFPSTKISEAPFLHPAPRSRSPLLVTVVESDP 
RPCGOPRJ^GYTASSLDSSSSWRERCSHNRDIJ^NSORI^rVSFHL 
NKLKYNSTVKESRNDISLI LNEYAEFNKVMKNSNQFI PQDKELN 
DVSGEATAQEMYLPFPGRSAS YEDI 1 1 DVCTNLHVKLRSVVKEA 
C KST F LF Y LVETEDXSF FVRTKNLLR KGGHTE 1 E PQHFCQAFHR 
ENDTL1II IRNEDISSHLHOIPSLLKLKHFPSVIFAGVDSPGDV 
LDHT Y QEL FRAGG FV I SDDX I LE A VTL VQLKE 1 1 K I LEKLNGNG 
RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKSF0SANI1ELLH 

yhck:dsrsstkaeilkc^ljjloiqhidarfavlltdkptiprev 








FENNGI LVTDVNKF1EN1 EK 1 AAPFKSSYW 


2 


234:- 


IPGSTISCSCHSRRLRGGSFAPRLSI^AASPRPRPPSLPLPLPL | 
PFPLFLPTR PAERAW 1 R S R RaS EWVGKMEVPRLDHALNS PTSP C 
EEV1KNLSLEAIQLCDRDGNKS0DSGIAEMEELPVPHNIK1SNI 
TCDS PK I S WEMDS KS KDR 1 Tl: V FI DLN KKENKNSNKFKKKDVPT 
KLVAKAVPLPMTVRGHWFLSPRTEYTVAVQTASKOVDGDYWSE 
WSEI 1EFCTADYSKVHUT0LLEKA5VIAGRMLKFSVFYRNQHKE 
YFDYVREHHGNAKOPSVKDNSGSHGSPISGKLEGIFFSCSTEFN 
TGKPPODSPYGRYRFEIAAEKLFNPNTNLYFGDFYCMYTAYWYV ! 
ILVIAPVGSPGDEFCKQRLFOLNSKDNKPLTCTEEDGViVYHHA 
CDV I LEV 1YTD PVDLS LGTVAE I TGHQLMS LSTANAK KDPSCKT 
CNISVGR 


6653 


170 




FFLEPRLRPFPASRARKVPARTRPSPLHPCCFCFEGGGSMLSPQ \ 

R VAAAASRGADDAMES S K PG PVO WLVOKDOH SFELDE KALAS I 

LLODHIRDLDWWSVAGAn^KGKSFILDFMLRYLYSOKESGHS 

NWLGnPEEPLTGFSWPGGSDFETTGlOIWSEVFTVEKPGGKKVA 

WLMDT0GAFDS0STVKDCAT1FALSTKTSSVQIYNLSQNIQED 

DLOOLOLFTEYGRLAMDEIFC'KPFOTLMPLVRDWSFPYEYSYGL 

0CGMAFLDKRLQVKF.H0HSE10KVRNH1HSCFSDVTCFLLPHPG 

LOVATSPDFDGKLKDIAGEFKEOLOALIPYVLNPSKLMEKEING 

SKVTCRGLbEYFKAYl KI YQGEDLPKPKSMLQATAEAYNIiAAAA 

SAKDI YYUNMEEVCGGEKPYLS VD1 bEEKHGEFKQlALDHFKKT 

KKMCGKDFS FRYQQELEEE1 KELYENFCKHNGSKNVFSTFRTPA 

VLFTGI VALY 1ASGLTGF3 GLE WAQLFNCMVGLLLIALLTWGY 

IRYSGQYREI^GGAIDFGAAYVL.EQASSHIGNSTQATVRDAWGR 

PSMDKKAQ 


6654 


1 


70 5 


RTSLiSPSCCSSFNLiAMASAGMQI LGVVLTLLGWVNGLVSCALPM 
WKVTAFlG^SlVVAOVWEGLWMSCVVOSTGOMOCKVYDSUUAb . 

podloaaralcviallvalfgllvyiagakcttcveekdskarl i 
vltsgivfvisgvltlipvcwiahavirdfynplvaeaokrelg ! 
aslylgwaasgllllgggllcctcpsggsogpshymarystsap 
aisrgpseyptkwyv 


6655 




1 f 


KDAYMPKKGLLALAL.VFSLPVFAAEHWIDVRVPEQYQOEHVOGA 
I N I P L KEVKER I ATAVPDKNDTV KVY CNAGRQSGQAKE J hS EMG 
YTHVEN AGG h KD I AM PKV KG 


6656 


2 


. ±2li 


1ELPPRPANUAIQPPLSPLRALAPLPEKPGAVPPP0KRMAKVAK 
DLKPGVKKMSLGQLQSARGVACLGCKGTCSGFEPHSWRKICKSC 
KCS0EDHCLTSDLEDDRK1GSLLKDSKYSTLTARVKGGDGIRIY 
KRNRM3 MTNP2ATGKDPTFDTJ TYEV7APPGVTQKLGLOYMELIP 
KEKOPVTGTEGAFYRRROLMKOLPIYDODPSRCRGLLENELIOjM 
EEFVK0YKSEALGVGEVALPG0GGLPKEEGK0OEKPEGAETTAA 
TTNG5LSDPSKEVEYVCELCKGAAPPDSPWYSDRAGYNK0WHP 
Tl_ r Vv-AKCSfc. PI»VDIj I J rn KDGM Y WvvKniUbS5ijKFK^o^fV.Ufcl 
IFAEDYQRVEDLAHHRKHFVCEGCEOM^GRAYIVTKGQLLCPT 
CSKSKRS 


6657 


830 


212C 


LL.TCOER-AGDCLLSASTMKE^A^i'WSPKKVADWLLENA>IPEYCEP 
LEHFTGQDLINLTQEDFKKPPLGRVSSDNGQRLLDMIETLKMEH 
HLEAKKNGHANGHLNIGVD1 PTFDGS FS IKIKPNGMPNGYRKEM 
I K I PM PELERSQYPMEWC-KTFLAFLY ALSCFVLTTVM1 SWHER 
V P PKE VOPPLPDTFFDHFNR VQWRFS 1 CEINGMILVGLWL 1QVL 
bliKYKS 1 1 SRRFFCI VGTLYLYPCI TMYVTTLP VPGMHFNCS PK 
LFGDWEAOLRRI MKL1 AGGGLS I TGS HNMCGDYLYSGHTVMLTL 
TYLFJ KEYSPRRLWWYHWI CWLLS WGI FCILLAHDHYTVDVW 
AY Y I TTRL»FWWYHTMANQOVLK EASQMNLLAR VWW YR P FQ Y FEK 
NVQGJ VPRSYHWPFPWPWHLSROVKYSRLVNDT 


6658 | 3S 


85^ 


KCCAbG APGS PYRGLY FSS AAPCTAPRKAKHQSTLEG LTKRMLM 






FDPVpVKQEAMDPVSVSYPSNYMESMKPNKYGVI YSTPLPEKFF 
QTP EG bS HG 1 OME PVDLTVM KRS S P P S AGNS P SSLKF ?S SH RRA 
SPGLSMPSSSPFIKKYSPPSPGVQPFGVPLSMPPVI4AAA1»SRHG 
I RS PG 1 LP V 1 OP VWQP VPFM YTSHLQC PLMVSLSEEMENS S S S 
MQVPV1ESYEKP1SQKK1K1EPGIEPQRTDYYPEEMSPPLMNSV 
SPPOAI.LQE 


6659 


18 52 > 
1 


EPOSGDCETWFQNCSLPKFVCFFCWGFWLWRAHSMSNLHSLPGL 

CGNPT 1 7CPHNRTLNNCHHSGVQV PL-MY CNL1TFS PON I SNCR Y 
AOTPAKMFYIVACONRDQRRDPPQYPWPVHLHTl 1 


6660 


£14 1707 
1 
I 


CAASLDCRHHLCEPDMKLVWPSAKL.LOAAAGASARACDSVTSNV 
LPLLLECFHKHSOSS0RRT3LEMLLGFLKLO0KWSYEDKDQRPL 
NGFKDCLCSLVFKALTDPSTOLQbVGlRTLI^VbGAOPDLLSYED 
LELAVGI-'IiYRIjSFLKEDSOSCRVAAIjEASGTLAALYPVAFSSHL 
VPKLAEEURVGESNLTNGDEPTOCSRHLCCLQALSAVSTHPS I V 
KETLPLLLOHLWOVNRGNMVAOSSDVIAVCO^LROMAEKCOODP 
ESCWYFHOTAlPCLLAbAVOASMPEKEPSVLRKVLLEDEVLAAM 
VSV1GTATTHLSFELAA0SVTH1VPLFLDGNVSFLPENSFPSRF 
QP FQDGS SGQRRL 1 ALLMAFVCS LPRNVS EH I WEVLLFNLDK VT 
Kir 


6661 


175* 4 3 0 


GVHAASGTbSATWl^EAKMFDSLAXAGKYLGOAAKl>MlGMPDYD 
NYVEHMRVNHPDQTPMTYEEFEKERQDAKYGGKGGARCC 


6662 


i 


RSLPKPAPAQPAS2HCARFSGVTPPTAKTAMSDGNTAFNALMYC , 
GPKADDGNIFSACAPASSAVKASVSVAOPGQAVIF 


6663 


3 100< 



RPVLSSRVDDFVPPLPETSGRRKKLERMYSVDRVSDDIPIRTWF 
PKENLFS FOTAS TTM0AI SNFR KHLRrfVGSR R VKAOTFAERR ER 
croDcuciiD r, DwtwfinTcur>cDri<!cnmccucTi nFaFPni nwrix 

Jbr lr> KoWbUr . rn JvrUJ 1 onL'.sKlJoo ULAJSiynl*- 1 L>UCJ\T C*UL>UrlU 1 

EKGLEAVACDTEGFVPPKVMLISSKVPKAEYl PTI IRRDDPSI 3 
P I bY DH EHAT FED I LE E I ER KIjNVYH KG AK I W KML>3 FCQGGPGK 

lyllknkvatkakvekeedmikfwkrlsrlmskvnfepnvihim 
gcy i lgnpngeklfonbrtlmtpyrvtfesplelsaogkomi et 
yfdfrlyrlv:ksroksklldfodvl 


6664 


S8 | 96* 



PRLLRl.PRSVWMDSPWDELALAFSRTSMFPFFDIAI-lYLVSVMA 
VKROPGAAAIiAWKNPl SSWFTAMLHCFGGGILSCLL1AEPPLKF 
LANKTN I LLAS S I WYI TFFCPKDLVSQGY^ YLPVOLLASGMKEV 
JRTWKI VGGVTHANSYYKNGWIVMIAIGWARGAGGT1 1TNFERL 
VKGDWKPEGDEWLKKSYPAKVTLLGSVIFTFQHT0HUA1SKHNL 
MFLYTI FIVATKI T74MTTQTSTMTFAPFEDTLSWMLFGW00PFS 
SCEKKS EAKS PSNG VGSLAS KPVDVASDNVKKKHTKKNE 


6665 


173 


1271 


BERRLACRQVVTQORSELYPGFOKRORFLFKAGEEAAAQGGRKb » 
PGRWLGPGCTONPCSVHTATGPEPRICLPLLPPDSPNSGYPKEPA | 
ALCPGI PSPCRMTHQDLS 1TAKLINGGVAGLVGVTCVFPI DLAK [ 
TRLQNQH G KAM Y KG M 1 DCLM KTARA EG FFGK YRG AAVNLTLVTP 
EKA3KLAANDFFRRLLMEDGMQRNLKMEMLAGCGAGMCQVVVTC 
PKEKLK I ObQDAGR LAVKHOGSASAP STSUSYTTGSASTHRRPS 
ATLJAWELLRTQGIJH-LYRGLGATLLKDI FFS 1 1 YFPLFANLNN 
LGFNELAGKAS FAHS FVSGCVAGS I AAVAVTPLDVLKTRIQTLK 
KGLGEDMYSG I TDCAR 


6666 


498 


2866 


MTTFLPVPOMMAGFSFGTFGNPPMESPSAWQTlHQPFIVSCbTb 
WSPGCWPOPICKEGVGLWD1RKPQSSLLRYGGNLSLOSAMSVRF 
NSNGT0L1ALRRR L PPVLYDI HSRLPVFQFDNOVYrNS CTM KSC 
CFAGDRDQYI LSGSDDFNLYMWRI PADPEAGGlGRWNGAFWVL 
KGHRS 1 VNQVRFN PHTYMI CSSGVEKI I KI WSP YKQPGCTGDLD 
GRI EDDSRCLYTHEEY I SLVLNSGSGLSHDYANQS VQEDPRMMA 
FFDSLVRREIEGWSSDSDSDbSESTILQLHAGVSERSGYTDSES 
S ASL PR S F FPTVD ES ADKA FHLGPLRVTTTNTVASTPPTPTCED 



537 









AASRQQRLSAJbRi?YODKRLLA.LSNESr:SEENVCEVELDTDLFPR : 
PRSPS PEDES SSSSSSSSSEDEEELNERRASTWORNAWRRRQKT 
TREDKPSAPIKPTNTY3GEDN-7DYPQIKVDDLSSSPTSSPERST 
STliEIQPSRASPTSDI ESVERKI YKAYKWLRYSYISYSNNKDGE 
TS LVTGEADEGRAGTSHKDN P A P SS S KEACLN I AMAQRNODLP F 
EGCSKDTFKEETPRTPSNGPGHEHSSHAWAEVPEGTSQDTGNSG 
SVEHPFETKKLNGKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSbETlCANHNNGRljHPRPPHPHNNGONIXSELEW 
AYSSPGHSD7DRDNSSLTGTLLHKDCCGSEMAC5TPNAGTRKDP 
TDT PATDS SRAVHGHSGLKRCR 1 ELEDTDSENS S S F.K K l.KT 


6667 


173 


1310 


AEEVEKLAAMRSDSLVPGTHTPP1RRRSKFANLGR1FKPWKWRK 
KKSEKFKHT5AALERK3 SMSQSKEEZjIKKGVLKEI YDKDGELS3 
SNEEDSLENGOSLSSSQLSIiPAl^SEMEPVPMPRDPCSYEVLOpc 
DINDGPDPGAPVKLPCLPVKLSPPLPPKKVMICMPVGGPDLSLV 
SYTAQKSGQOGVAOUHHTVLPSQIQHQbOYGSHGQHLPSTTGSL 
FMHPSGCRM1 D ELN K TLAMTMC R L ES S EQR VP CS TS YHSSGLHS 
GDGVTKAGPMGLPEI R0VPTWI ECDDNKEN'VPHESDYE'DSSCL 
YTREEEEEEEDEDDDSSLYTSSIxAMKVCRKDSLAIKPSNRPSKR 
ELEEKN I LPR0TDEERLEI>RQ01 GTKL 


6668 


714 


35fi 


TLA VATG P ALT LR CI fVC TS S SN C K HS WCFA S S R FC KTTN TV E F 
LRGNLVKKDCAESCTPSYTLOGOVSSGTSSTQCCQEDLCNEKIiH 
NAAPTRTALAHiJALS LGLAL SLLA V I LAPSL 


6669 


4SS 


1207 


KDEETRKDYrYMLDHPEEYYSHYYHyYSRJRLAPKVDVRWl LV£ 
VCA ISV FQFFS WWNS YWKAI S YLATVPXYRIQATE I AKQQGLLK 
KAKEKGKMKKSKEEIRDEEEN11KNIIKSKID3KGGYQKPQICC 
LLLFQI I LAFFHLCS Y 1 VWYCRW3 YNFNIKGKEYGZEERLYI I R 
KSMKMSKSQFDSLEDHQKETFLKRELW3KEMYEVYK0EQEEELK 
KKLANDPRWKRYRRW.vjKNEGPGRLTFVDD 


6670 


184 


S94 


VARJ'GEAAKMSSEFPPPYPGGPTAPLLEEKSGAPPTPGRSSPA 
VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFIPPHMSADGTYM 
PPGFYP P PGPHP PHG Y YP PGP YTPGPY PG PGGHTATVLV PSGAA 
TTVTV 


6671 


1 


763 


bPAEKPRSAPNMAGG^CGPOLTALLAAWIAWAATAGPEEAALP 
PEQSRVOPMTASNWTLVMEGEWKLKFYAPWCPSCOOTDSEVIEAF 
AKWGEIUJI SVGXVDVIQEPGLSGRFFVTTbPAFFHAKDGl FRR 
YRGPGI FEDL0NY1 LEKKWQSVEPbTGWKSPASLTMSGMAGLFS 
1 5 GK I W } 1LHNY FTVTLG I PAW C S YV P FV I ATLV FG L.S MDL V L * V 
I SOCNWDPPYRHVS * /RPSTNLGVHTAHTSEHLPL 


6672 


304 


1083 


APGS KP VQ FMDFEG KTS FGMS VFN I SVh 1 MGSG3 LGLAY AKAHT 
GV1FFLALLLCTALLSSYSIHLLLTCAG1AGIRAYE0LGQRAFG 
PAGKVWATVICLHNVGAMSS YLFI I KSELPLVJGTFLYMDPEG 
DWFLKGNLLI IIVSVLI ILPLALMKHLGYLGYTSGLSLTCKLFF 

MFHS * LTG VLTQWP J MA FAFVCH PGGAG PS I TELCRA FQAQD 


6673 


1116 


1963 


LQIOTHHTHHGARVTHLGSHOL1^NAGTMLCRCX)SSSMAPAFSQ 
S VTCGP SPCVR KQESATKCLH3 GACGSDLWARG WEQG * G *GLNV 
WI.CPCVAFHRGARPOAEEGGARWJSLVSSPWIPPNP*HSSJGA£ 
NAVPRP*QG*KVNPSGOERQS\vrVXPLPVPGEPLKLPGLPG*NK 
SFSRV/SGSKGKWILPRQLM*AS*R\TPRPVPGTOWVPITW/PL 
1TWH*SAPTPPLKACPAPRPSDPCSSCLSCPCVT0KPRFSDTGW 
FGAGHCHSSCDFTRKGAAGGPG 


6674 


1 


4*0 


uEKDYMCQYDYVEVRDGDNRDGOHKRVCGNERPAPIQSIGSSL 
H^LFHSDGSXl^FDGFHAIYEEl'TACSSSPCFHDGTCVLDKAGSY j 
KCACLAGYTGORCENLLEERNCSDPG/WPSQWVPENNRGPWAYQ j 
PTPC* 1GTRVAFFLT ! 
PCT/US00/34263 





6675 


277 


2678 


GKWPTERr*lAFirNPTIlliAHIROSHVTSDDTGMCEMVLIDHDVD 
LEKIHPPSKPCLS3SEI0GSNGET0GYVYAQSVDITSSMDFGIR 
RRSNTAORLEf.l.KKERONOlKCKNIQWKEJiNSXQSACELKSLFE 
KKS1jKEKPP2 5 GKQS1LSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVVl.PLHSSODSLLPMTWTMAS/iRVOCl 1GLICWQ 
YTSEGREPKl.Nr?vrV£AYCLHIAEDDGEVDTDFPPLDSKEPIHKF 
G FS TLALVEK V 5 PGLTS KES LFVR r NAA>:G FSL I OVDNTKVTW 
KEIl.LKAVKRKKGSCKVSGSRADGV^EEDSOIDIATVODMLSSH 
HYKSFKVSM2HRLRFTTDV0L/GCALFPGVLRKRAAFVDCLRPS 
ADTWRQEQ3GCCGAACLAALjRS*DSHKC*EG 1SGDKVE1DPVTNC; 
KASTKFWI KOXP I S3DSDI>LCAC\DLAEE 


6676 


277 


1678 


GKWPTERMAFLDNPTI 3 LAHI RQS H VTSDPTGMCEMVLl DHDVD 
LEKI HPPSMPC DoGSElOGSNGETOGYVYAQSVDITSSWDFGIR 
RRSNTAQRLE?. 1 .R KE RQNQ I KCKN I QWKERMS KQS AO E L K S LFE 
KKSLOCEKPP I SGKOS 1 LSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKI DV VI PLHSSODRLLPMTWTMASARVpDLl GL1 CWQ 
YTSEGREPKLNLK^SAyCLHIAEDDGEVDTDFPPLDSNEPlHKF 
GFSTI^AEVEKVr.SPGLTSKESLFVRINAA>jGFSLlOVDNTKVTM 
KE1LLKAVKRJ.KGSCKVSGSRADGVFEEDS0IDIATV0DMLSSH 
HYKSFKVSM3W R F TTDVQL/ GCALFPG VLR KR AAP VDCLR PS 
ADTWRQEQ1GCCGAACAALRS* DSHKC* EG 3 SGDKVE2 DPVTNQ 
KASTKFWIKQKI 3SIDSDLLCAC\DLAEL 


6677 


277 


1676 


GNWFTERMAFLi.WPTl 2 1AHIRC»SHVTSDLTGMCF>1VJUIDHDVD 
LEK1HPPSMPGI SGSElCJGSNGETOGYVYAOSVDITrSWDFGIR 
RRSNTAQR1.ee 1 R KER0N0 3 KCKN I QWKEKKS KQS AUELKSLFE 
KKSLKEKPP15(.:<0S1LSVRLE0CP1>QLNNPFNEYSKFDGKGHV 
GTTATKKID^'L rLHSSODRLLPMTWTMASARVODLlGL.1 CWQ 
YTSEGREPKLMNVSAYCLHIAEDDGEVD7DFPPLDSNEP1HKF 
GFSTLALVEKY. f r;PGLTSKESLFVRINAAl]GFSL20VDNTKVTK 
KElLLKAVKKR.KGSOKVSGSRADGVFEEDSOlDIATVQDMliSSH 
HYKS FKVSM J KR 1,R FTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
AUrW K\JfcU-H->(- v- GAAC AAJjKS*IJi>iiK.U w fcGi oGJJft.v kJLD,r V 1 wy 
KASTKFWlKOKf J SIDSDl>LCAC\DbAEE 


6678 


222 


665 


GPSNQSSGSLr:, j VTGCSS YWS * INDTCH LRVLSSNFGRQ* LR 
PFPCSQLPMS CGCLWHLDCCCPWVPY I PGOCWRKGRORMRN * OS 

llgsdqesvgl?:dlcvfvnfluivllglfp*phelfllpwdlg 

FLFPLLLOGGCHCLVLPANLVSOAPOIGKl^CRLQTKDLEGSRN 
HHPLFLWGR Vi r.^V KH LET VQSGLAS 1>G FVGOHTSHGPP 


6679 


2 


786 


LEFARGAMFFLGCDWRSPGQNWVKTVDGWKRFLDEKSGSFVSDL 
SSY CN KEVYN K 1. KLFNSI.N YD/SCSQEEKEGKAE • CWONS \DFH 
OEKVf r YVHKG r 7 K ERHG YCTLGEAFNRLDFSTAI LDS RR FN YVV 
RILE L I AKSQL7S LSG I AOKNFMN 2 LEKWLKVLEDOON 3 TLI R 
ELLQ7LYTSLC7 1. VKR VGXS VLVGNI NMWVYRMETI LHWQQQLN 
NIQITRVSGQAC^PPGSGSLHRDTGQTRODFEFTPVTEESGLF 


6680 


1498 


2951 


PbCTLPLMPSAL.-GWAGERWEKOWPLA/PGPGTWQTPVGSISEE 
P\RKNE PDTH C 'P R GE AR PE V • HLPKPHS PGS EGAE J QTSA* AL? 
/NOVS PPQPM' CAEENGDCRGGKEEAGEELHRS SSGLTAAPGF? 
E VHRN LQTFPG L P SRGGG P/GGAGTQGSWAPGEQPP / S PLLPAS 
MORSOAGLPGKLAGLVESPTHHIPALRPSGTNATGEAFPSTTCS 
SGP\PAPPGPTGLRPGGGSSSGGHG* ♦ pglpvgkvXgalgaaqd 
POSOGRGPTGGTVGTEKLLSGLGSAKACPAARPAVP*LPSDPAS 
TI PKKGTRGFGIG PGVLOERNRWWGRAQGFTSADAAGTAPPGV 
♦ LPAPLSQPPGA2 EPOVRACGMAPPSPGTSGRLVANGRHPGPOV 
AOG CP PGAGCWG S OPRGSQR CPRTYTHSPLGHGRAPCPRP. CWH* 
WODPPSSPRTGCLPGIPARQAYSAPRTRSRPGIRTGRAAYGFIR 
FOGGGGG 


6681 


1169 


511 


1 n V 1 VVN000RAFH ELK\EKIiMSAPALGLPDLTK LKTLH VSERE 
KMTVG VLTOTVGPWS RPGAYJUSKOLDGVSKG W PPCPRALAATA1* 

LJ\\JE>f- \xJC/L> l L>Kyi\ljjvr< J\oJrrl/* \ v v i bin i ivvnM 
TLLCENPHKTIEVS NT/ LNPATLLLVTESPVKHNCLEVl.DSVYS 
S R PNLR DK P ♦ TS VD WELYVDGSGFANPCKVTI i K KETS F APVTPR 


6682 


105 


1238 


T VLCG AMQVS SLNE VK I YSLS CGKS LPEWLS D R KKRALQKXDVD 
VRRRIEL]ODFEMPTVCTTIKVSKDGQYlLATGTyKPRVRCYT>T 
YCLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRY1EFHSQSG 
FYYKTR1PKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN 
PLQTDAAENTWCD1 NSVHGLPATGTI EGRVECWDPRTRKRVGLL 
D\ AP * TVS003 OR * TS LPTISALKFN\GALTMAVGTTTGOVL.LY 
DLRSDKPLLVKDH0YGLPIKSVHFQDSLDLILSADSR1VKMWNK 
MSG K 1 FTSLEPEHDLNDVCLYPNSGMLLTAN E T P KMG 3 Y Y I PVL 
GPAPRWCSFLDNLTEELEENPESNE 


6683 


10^ 


1236 


TVbCGPWVSSLNEVKlYSLSCGKSLPEWLSDRKKRALOKKDVD 
VRRRIEL1QDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT 
Y QLS LKFER CLDS EWTFEI LSDDY SKI VFLKN DRYJE FHSQSG 
FY Y KTR I PKFGRDFS YHYPSCDLYFVGASSEV YKLNLEQGRYLN 
PWDAAENNVCDJNSVHGIjFATGTJEGRVECWDPRTRNRVGLL 
D\AP*TVSQOI0R*TSLPTISALKFN\GALT(4AVGTTTGOVLLY 
DLRSDKPLLVKDHOYGLPIKSVHFODSLDLII.RADSRIVKMWNK 
KSGKI FTSLEPEHDhNDVCLYPNSGMLLTANETPKMGl YYIPVL 
GPAPRWCSFLDNLTEELEENPESNE 


6684 


113 




Gl.RGGT £ RGRAGRE P E F AAGVLCW AG FCQSPC P PG6RGR EA PA 
PP\SGR RHA* RPA* WLGGPGGDSGGREEGGS /GELORAWESKMG 
ELPLDINIOEPRWDOSTFUSRARHFFTVTDPRNLLLSGAOLEAS 
RN1VQNYR 


6685 




1473 


KLiLGDNFEGFCNKFELSDSSNGSNS*OSPL\FDKbFDPDPOKVL 
OG V I DM KNA V I GNNK 0KANL1 VLGA V PRbLY LLQOETS S TELKT 
ECA W LGS LAMGTENNV KSLXDCH 1 1 PALLOOLLS PDLKFIEAC 
LRCLRT1 FTSPVTPEELLYTDATVI PHLMALLS R S R YTQEY I CQ 
I FSHCCKGPDHQTI LFNHGAVQNJ AHLLTSLS YKVRMQALKCFS 
VXAFENPOVSMTLVNVljVDGELLPOI FVKMLORDKP I EMQLTSA 
KCLTYMCRAG Al RTDDNC I VLKTLPCLVRMCS K E RLLEER VEGA 
ETLAYL1 EPDVEL0R1ASITDHLIAMLADYFKV PSSVSA3 TD3 K 

l\JjUMlJljPvrtMnr.ljKyMMr J\.l-» 1 /^olA3MT» 1 >iL\J i kim\ v oiAir^n tr r v i» 

TASROGVTST 


6686 


3 3C 


927 


DSVTFDDl^VDFTPK^WTLLDPTORNLYRDVMLENYKNLATVGY 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 
LGEPTSSG1QMIGSHNGGEVSDVK0CGDVSSEHSCLKTHVRTQN 
SENTFECYLYGVDFLTLKKKTSTGEORSVFSHVWKKPSSLNPDV 
VCQKNRCTRKXKAF* bQLTLGKSFH* SIHT 


6687 


lei 


915 


EAf-HiEAPYKKEEDEOORKEVKKDYPSNTTSSTSNSGNETSGSST 
IGETSNRSRDRDRYRRRNSRSRSPGROCRHRSRSWDRRHGSESR 
SR DHRRE DRVW YRS PP1ATGEPVDNLS PEERDARTV FCMOLAAR 
I RPRDLEDFFEAVGKVP.DVR J I SDRNSRRSKG1 AYVEFCEIOS V 
PLAIGLTGQRLLGVP 1 3 VQASQAEKNRLAAMANNLO KG NGGPMR 
LYVGSLHFN1 TEDMLRGI FEPFGKV 


6688


102i 


1 


AEVPNYPRVFHKCPDSCWRFKFQPIQIWYILLSFSSEKPPISF* 
S E PGLPR / S AT ARMATAAAP PNSS I DLP S DSGMG F I S p AGDS LP 
LPSDGGTGFPSLAGDSSSTRI>SSLAF1SFSLSSVSVGSSAGTTS 
STSVGSWAAFTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRK 
VAVICGSKGAGA5GSASCSSRAGKTTEATAASSMPSGTSSFSTC 
TttSELEELFSLPSPAPLLSKLPTSSGSlAICCQDSGPSDTGRLS 
VCOLWIiADSDTGKLSDCQEWTVGDSGGLTCPEl- SLGRK * MSLIi_ 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SSAVIP6YSSSSDSRLNTVPTVDLLCPFQTKSST 


6689 


64G 


1299 


SSSASYATSATSISDTAFSGSLKLKHGLLSALDSSSRTS*STS£ 
AEDSTFR1CSPSVSDTSSDSSGSKDNVL1LFSKVS1*SCFSLSS 
FFSDS 1 ? FCFSSSSFCKR* FVS f KVSQNALLSSRLSNGPGGSSK 
CRNSLTARQLAMSL* ATKF * RNACNPNCLS SKKSAL* LSLNQR F 
GGSASKKPGN1SFKS0KCSALSVCCNFV1KPREVSVSSENYPAF 


6690 




442 


gtrgxk-j;tlgplcswox?wrrclsardgsrmlllllllgsgogp 
oovgagotfeylkrehslskptcxsvgtgssslwnlmgnamvmtq 
y 2 rltp2mqs kqgalwnrvpcflr dwelqvk fx i hgqgkknl\h 
gdgla1kytkdrmqp 


6691 


2E'; 


1401 


LKTET? t EKARR Y KDRPSQLNAV FQEQKKM 1 OAQES 1 TLEDVAV 
DFTWEEWQLLGAAQKDLYRDVMLEITCSNLVAVGYQASKPDALFK 

HEHDAFEK1 VHCSKSOFLLGONMDI FDLRGKSLKSNLTLVNQSK 
GYEIXKSVEFTGNGDSFI.HANHEKLHTAIKFPASOKLISTKSOP 
ISPKHOKTRKLEKHirVCSECGKAFIKKSKLTDHOVMHTGEKPHR 
CSLCEKAFSRKFMLTEHORTHTGEKPYECPECGKAFLKKSRLNI 
HQKTKTGEKPYICSECGKGFIQKGNLIVHQRIHTGEKPYICNEC 

/fiVTIFT riKTPT T AHHRPI ITFR 
/ oivur i \j *\ J v_Jj xm t\/t\r 1 1 x Jur\ 


6692


111 


939 


WlKEGEI^LWERFCAoNllKAGP^PKHIAFlMDGNRRYAKKCQVE 
RQEGHSOGFNKLAETLRWCLNLG I LEVTVYAFSI ENFKRSKSEV 
DGLMDL>\ROKFSRI^iEEKEKLOKHGVCIRVLGDLHLLPLDLOEL 
1 P.QAVGATKNYNKCFLNVCFAYTS RHEI SNAVREMAWGVEQGLL 
DPSDISFSLLDKCLYTNRSPHPD3 LIRTSGEVRLSDFLLWQTSH 
SCLVFCPVLWPEYTFWNLFEAILOFOMNHSVLQK 


6693 


111 


939 


W I KEGELS LWERFCAN1 1 KAGPKPKHIAPIMDGNRRYAKKCQVE 
RQEGHSCKjFNKLAETLRWCLNLG ] LEVTVYAFSI ENFKRSKSEV 
DGIWDLAROKFSRLMEEKEKLOKHGVCIRVLGDLHLLPLDLOEL 
IAOAVOATKNYNKCFLNVCFAVJ.SKHETSNAVREWAWGVEOGLL 
DPSD1 SESLLDKCLYTNRSPHPD] LI RTSGEVRLSDFLLWQTSH 
SCLVFQPVLWPEYTFKNLFEAILOFOMNHSVLCK 


6694 


2 92 


813 


SLLLHLAPPGAYTPSOPLSSVS^KTASSVRROAAESROHELPVR 
EVHSLGQ1 LPQDGLTAEAGPPEAODPWGSPG 3 SLPAAH1GFAAA 
LAVG PSGCHTEP \ FDE VW PSLFL GDA Y AARJDKS KL 1 QUG I TH W 
N AAAG K FO VDTGAK F YRG MS LE Y Y G I E ADDN P F FDLS VY FLP 


6695 


29;: 


813 


SLLLHLAFPGAYTPSQPLSSVSTFTASSVRROAAESRQKELPVR 
EVHSLGQ1 LP0DGLTAEAGPPEAODPWGSPG1 SLPAAH1GFAAA 
LAVGPSGCHTEP\ FDEVWPSLFLGDAYAARDKSKL1 OLGITHW 
NAAAG K FO VDTGAKFYRGMSLE YYG I EADDNPFFDLSVYFLP 


6696


4 


782 


PRVRGRVGERWAFLSVPAAMSSEMEPLLLAWSYFRRRKFOLCAD 
LCTQMLE K S PYDQAAW I LKARA LTEM VY IDE1 DVDQEG IAEMML 
DENAI AO V PRPG7SLKLPGTNQTGGPSQAVR P 1 TQAGR P ITGFL 
R PSTQSG R PGTKEQA I RTPRTAYTAR P 1 TSSSGRFVRLGTASML 
TSPDGPFlNLSRLNLTKYSOKPKloAKALlEYJFHHENDVKTALD 
LAALSTEHSOYKDWWWK/DQIEKCYYRVGMYREAJSKOIKSS 


6697 


3 


782 


PPLFLRR LNSRALR PGSRKVMAWFAS hSGQWGS FAYLTI KDR 
I POILTK V I DTLHRHKSEFFEKHGE EG VEAEKKAI S LLSKLRNE 
LOTDKPF1PLVEKFVDTDIWNOYLEYOOSLLNESDGKSRWFYSP 
WbLV\EC YMYRRIHEAI \ 1QSPP 3 DYFDVFKESKEQNFYGSQES 
I IALCTKLQQLIRTI E DLD \ ENC L KDE FFKLLQI SLWGEISVDL 
SL\SGGESS SQNTNVLNS LEDLK PF I LLNDKEHLWSLLSNCK 


6698 


666 


754 


VGSCACAGSCKCKECKCTSCKKSECRAFP 


6699 


325 


492 


EGELP/PARRVLPRAMTASAOPRGRRPGVGVGVWTSCKHPRCV 
LLGKRKGSVGAGSFOLPGGHLEFGETWEECAOHETV/EEAALHLK 
N\THFASVVNSFlEK£NYHYrVTIL v .KGEVBVTr^ 






i ESKR I I YNHAFFFQES XWSGGI LQ 


6700


lC9fc 


1392 


TQCW RS STPGMRTHFR7QP / RLECGQG FSQQENGHCMD1 N EC1Q 
FPFVCPKDKPVCVNTYGSYRCRTKKKCSRGYEPI-JEDGTACVERT 
LLLGbCNLLGK 


6701 




148b 


AAAGPRTRVRRAAAFEC-OPSPSPGLGPTSDKAAAPRTPKRRRLW 
RORO/HPA^LCYVTRPDAVLMEVEVEAXANGEDCLNOVCRRLGI 
I EVDY FGL0FTGSKGESLWLNLRNR3 SQOMDGLAPYRLKLRVKF 
FVEPHLI LQEQTRH 1 F FjUH 1 XEALLAGHLLCS PF.QAVE LSALLA 
OTKFGDYNONTAKYTTi'EELCAKELSSATIiNSlVAICHKELEGTSQ 
ASAEYOVIiQI VSAMENYGI EWHSVRDSEGQKLL j GVGPEG1 SIC 
KDDFSP1NRIAYPW0NATOSGKNVYLTVTKESGNSIVLLFKMI 






( STRAASGXjYT^lTETHAFYRCDTVTSAVMMC'YSRDliKGHLASbF 








LNEN1 NLGKKYVFDI KRTSKEVYDP^iRRALYNAG WDLVSRNNO 








S PSHS PLKSSeSSMNCSSCEGLSCOOTRVLOEKLRKLKEAMLCM 








VCCEEEINSTFCPCGHTVCCESCAAOLOVGESAAHFCLOPHLSI> 








LLTGSRSQVLAR 


6702


397 


1971 


PLAKFLiODLVNVLCLPKEDVFLFYRTCFCSMGLGSSCHLSLPK 
R AEAL1.CSRKATWRDLVAVRMAEE0E FTOtCKLPAQPS H PHCV 
NNTYRSAQHSQAXLRGLLALRDSG1 LFDWLWEGRHI EAHRIL 
LAASCDYFKGMFAGGLKEMEQEEVL1 HGVS YNAMCQI IMF I YTS 
ELELSLSNVQETLVAAC0L0IPEI1HFCCDFLMSWVDEENILDV 
YRU\ELFDLSRLTEQLDTY3l>KNFVAF$RTDKYRQLPbEKVYSl> 
LS S NRbEV S CETEV YEG Al^LYHYS LEQ VQ ADO J S L>H E P F K LLET 
VRFPLMI^EVU)RLHDKLDPSPLRnTVASAI,MYHRNESLOPSLQ 
SPQTEbRSDFOCVVGFGGIHSTPSXMSSATRPK'ybNPLLGEWKH 
FTASIAPRMSNOGIAVLNNFVYLIGGDNNVOGFRAESRCWRYDP 
RHNRWFOI OSL0QEHADLS VC WGRY 1 YAVAGR DYHNDLNAVER 
Y DPATNSW AYVAPLXREVYAH AGATLEGKM Y I TCGRKGR 1 T 


6703 


4b 


124 4 


G VGPRAAAMPLELELCPGRWVGGQHPCF 3 J AE 1 GONHQGDLDVA 
KRM 3 RMAXECGADCAK FOKS ELEFXFNR KALER F YTS KHSWGKT 
YGEHKRHLEFSHDOYREbORYAEEVGlFFTASGMDEMAVEFLRE 
LNVPFFKVGSGDTNNFPYLEKTAK/TRGWHSVLRDVCGVOLNDB 
TS S WDVIX5 R VRTS KE KV LMVL VLD Y S G R PMV I S S GMQS MDTMKQ 
VYQ3VKPLNPNFCFLQCTSAYPLQPEDVNLRV3SEYQKLFPD3P 
]GYSGHETG3A1SVAAVA1X5AKVLERK3TLDKTWKGSDHSASLE 
PGElJ^ELVRSVRLVERAl^SPTKOLLPCEhlACNEKLGKSVVAKV 
X 3 P EGT I LTMDMLTV KVGE P KG Y P P ED 1 FN LVG K KV1/VT V EEDD 
TIMSE 


6704 


82 


1007 


•JMNTRNRWNSGLGASPASRPTRDPODPSGROGELSPVEDOREG 
LEAAPKGPSRESWHAGCRRTSAYTL3APN3NRRKB3QRIAEQE 
I^LEKWXEQNRAXPVHbVFRRl^GSOSETEVHQKC^bObMQSK 
YKOKbKREESVRIKKEAEEAEIiQXMKAlQREKSNKLEEKKRLQE 
KLRRFJVFREHG^YKTAEFTa/ROTEHRIAJlOKCLSKCCLWPTILN 
MGQ K LGliQ \ DS LKAE ENR KLQKM KD EQHQKS ELLELKRQQQEQE 
F^KIHOTEHRRVNNAFLDRl^GKSOPGGLEOSGGCWNMNSGNSW 
Gl 


6705 




786 


RLCRNSARVPCGWSASRSLGEGAGFlGPbRGPHPRAGGTGTSFt 
SYKJ?KGG3MSTIAAJ^GGKS3L2TVATGFLGKELMEKLFRTSPD 
LKVI YI LVRPKAGQTbOKRVFQILDSKLFEKVI EVRPNVHEK3R 
A3 YADLNQNDFAISKEDMQELI^CTNI IFHCAATVRFDDTLRHA 
VOLNVTATRQLLLMASOflP KLEAF 3 H 3 STA YS W CNLKH I DEV I Y 
PCPVEPKK3IDSliEW\LDDAl JDE3TPKL3RDWPNIYTYTK 


6706 


330 


$31 


FTHSSSSHSQEMt>GKbNKLRNIX5HFCDIT3RVODKlFRA>IKVVL 
AACS DFFTRTKXVGQAEDENKNVLDLHHVT^nTG F 1 PLLE YAYTAT 
LS 3 NTEN 1 1 DVLAAAS YMQMFSVAS TCSEFMKS S I LWNTPNSQP 

EK 


6707 


2 23:- 


134 3 


Y WSGI GYELOHFHWRKFHfEKKGPPSTCQBRl.y ESKS RWPCIS » 
GMWVGKTAVNGSK*GG01»RCVCVCTSHSSDSTRSS0RASKCKS 
FFILSQ*KT+SSWENWVFAKYSRI YSYGHSCSKGRGD* DFK*NV 
SOAR* SRFCGLCNPCGHCGLDJNLRGGSSPWTDKHSCVHNNLLC 
NRRVFSLLCEGPGHCYOGAVCKEACAAASPGLDSAAEPHRLCEH 
TD*LPK*GPGYI0HFHCDSN1LC1LYNISFNLFSYSF*GVARYA 
C*RCHWYFEWLLYNHCGDILVACL*RRQL*SSQ 


6708


lib 


172S 


tvgswsrsgrsppvgrollligrgaoaagspogg^bcvelvpt 
geiirwhphrpcklalgsdgvrvtmesaltardrvgv0dfv1l 
enftseaaf:enlrrrfrenliytyigpvlvsvnpyrdt.qiysr 
ohmeryrgvsfyeepphllavadtvyralrterrdaa\-'misves 
gagktdatkrllqlyaetcpaporggavrdrllosnpvleafgn 
aktlrndnssrfgkymdvqfdfkga?vggkllsylleksrv\mq 
khgernfhify0llegceeetlrrlglernpqsyly1,vkgqcak 
vss tndksdwkwrkaltvi dftedevedllsiaasvlhlgn1h 

FAANEESNAQVTTEKQLK YLTRLLSVEGSTLREALTHR K 1 1 AKG 
E ELLS PLNLEQA^YARDAJ-»AXAVYSRTFTKLVGK INKS LAS KDV 
ESPSWRSTTVLGLLCIYGFEVFOHNSFEaFClNYCNhKLQQLFJ 
ELTLKSEOEEYEAEG JAWEPVQYFNNKIICDLVEEKFKGI l\SI 
LDE\ECLRPCE 


6709 



3 


8S4 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 
TAAKMEKKVSKR5RKEEEDLEALIAHFQTLDAKRT0TVELPCPP 
PSPRLNASLSVHPEKDELILFGGEYFNGOKTFLYNELYVYNIRK 
DTWTK VDI PSPP PRRCAHQAWVPOGGGQLWVFGGL FAS PNGEQ 
FYHYKDLWVLHLATKTWE0VKSTGGPSGRSGHRMVAWKRQL1LF 
GGFHESTRDY I Y YNDVY AFNLDTFTWSKLSPSGTGPT FR SGCQ\ 
1 PSLPRAASSVYGGYSKORVKKDVDKGTRHSDMF 


6710 


158 


96C- 


RHKMTNYRVESSSGRAARKMRLALMGPAFlAAIGYa DPGNFATN 
I QAGAS FG YQLLMWVWANLMAML3 QI LSAXLGI A7 G KNLA2QI i 
RDHYPRPWWFYWVQAE1 I AMATDLAEFIGAAIGFKLI LGVSLL j 
0GAVLTG1 ATFLI LMLORRGQKPLEKVIGGLLLFV.AAAY I VSLI 
FSOPNLAOLGKGMVIPSLPTSEAVFLAAGVL\GATI *PHVI/YI 
WHSSLTQHLHGGSRQORYSATKWDVAI AMTIAGP VN1.A J WATAA ' 
SELNFYGHTGVA J 


6711 


3 


34'/ 


VTECXTMTCKMSQLERNl ♦TMINTLHHYSVKLGHPDTL1HGEFK ! 
ELVRTDLHN1LM:<ENKND0A1*H1MEDLDTNAH«01 3 FKELIMb ! 
KAMLTWSYliDNMlIDADYGPGOOHRPG > 


6712 


118 


57t 


PHGQKRTRYPOVRAPGOOPCAOLAMALCLKQVFAKDKI'FRPRKR | 
FEPGTQRFELYKKAOASLKSGLDLRSWRLPPGEN3 D'JWIAVHV i 
VDFFNR I NLI YGTMAERCS* TSCPVMAGGPRYEYRWCOERQYRR j 
PAKLSAPR YMALLMDWI ES L ] 


6713 



2485 


3 


QARGSDSEDGEFE1QAEDDARARKLGPGRPLPTFPTSECTSDVE 
PDTREMVRAQNKKKKKSGGFOSMGLSYPVFKGIMKKGYKVPTPI 
0RKTIPVILDGKDWAMARTGSGKTACFLLPMFERLK7HSAQTG 
ARALI LSPTRELALQTLK FTKELGKFTG LKTALI LGG DRMEDQF 
AALHENPDZ 1 2ATPGRLVH VA VEMSLKLQS VEYWKDEADRLFE 
MGFAKQLOE 1 1 AR I .PGGKQ7VLFSATLP KLLVEFARAC-LTEPVL 
IRLDVDTKLKEOLKTSFFLVREDTKAAVLLHLLHW^RFODQTV 
VFVATKHHAE YLTELLTTQR VSCAH 1 YSALDPTARX 1 NLAKFTL 
GKCSTL I VTDLAARGLDI PLLDNVIKYS FPAKGKLFLKR VGRVA 
RAGRSGTAYSLVAPDEIPY1XDLHLF1X3RSLTLARPLKEPSGVA 
GVIX3M1^RVPOSWDEEI>SGLOSTLEASLSUIGU^V.2JDNAQOQ 
YVRSRPAPSPES I KRAKEMDLVGLGLHPLFSSRFEEEELORLRL 
VDSI KNYRSRATl FE1NASSRDLCSQVMRAKRQKDRKAI ARFQQ 
GQQGRQEQOEGPVGPAPSRPALOEKQPEKEEEEEAGESVEDlFS 
E WG R KR GRSGPNRGAKR R R E EARQRDQEFYI P YR P K S FDS ERG 








LS I SGEGGAFHOQAAGA^/LDLMGCEAONLTRGROObKWDRKKKR 
FVGOSGQEDKKKIKTESGRYISSSYKRDLYQKWKQKQKID*S*L 

grrrgiltrxrprtkevgearplaoagcipgphaprkplqaesa 
lelktkqqjlkorrraokaalslqrwwpoaalcpo 


6714 


169 


1416 


nncqellppppapmah '. psggapaagaapmgpoycvckvelsvs 
gqnlldrdvtsksdpfcvlftejwgrwieydrtetainnlnpaf 

SXKFVLDYHFEEV0KLKFALPD0DKSSMRLDEHDFLGOFSCSLG 
TIVSSKKITRPLLLLNDKPAGKGLITIAAQELSDNRVITLSLAG 
RRLDKKDLFGKSDPFL-EFYKPGDDGKWMLVKRTEVIKYTLDPVW 
KPFTVPLVSLCDGDKEKPIOVMCYDYDNDGGHDFIGEFOTSVSO 
MCEARDSVPLEFEC3NPKXQRKXXNYKNSG13ILRSCK1NRDYS 
FLDY1LGGC0LMFTVG1DFTASKGNPLDPSSLHYINPMGTNEYL 
S A I WAVG03 3 ODYDSDKM FPALG FG AQLPPDWKVSHEFA I NFNP 
TNPFCSGVDGIAQAYSACLP 


6715


32 


4 93 


G P AG AESGS LH C LP ATV Q ALAGAAK S PHGGQP PR R G PL 1 GSGMP 
G KPKHLGVPNG RMVLAVS DGELSSTTGPQGQGEGRGSSLS I HSL 
PSGPSSPFPTEEQPVASWALSFERLLQDPIXLAYFTEFLKKEFS 
AENVTFWKACER FQQ1 PAS DT 




1 


176 


GAGGPAPRS FGCEEPKAALERDKMS ARAAAAKSTAXEETAl WEQ 
HTVTLHRVSLCCSK 


6717 


US 


8 96 


lFAMSGFENLNTDFYOTSYSIDDOSOOSYDYGGSGGPYSKOYAG 
YDYSOOGRFVPPDMMQPQQPYTGOJYQPTQAYTPASPOPFYGNN 
FEDEPPLLEFXGI NFDK 1 WQKTLTVLHPLKVADGSI MNETDLAG 
PMVFCLAFGATLLIAGKlQFGYVYGI SAIGCLGMFCLLNLMSMT 
GVSFGCVASVLGYCLLPKILLSSFAVIFSLQGMVGI ILTAGJ 1G 
WCS FSASK1 Fl SALAKEGOOLLVAYPCALLYGVFALI SVP 


29C 




KQSSTV PGT 3 LP S LKWHN SGLCKFP ETGGKMTTF KEGLTFKDVA 
V3 FTEEELGLIjDPVORNLYODVMLEN frnllsvghhpfkhdvfl 
LEKEKKLDIMKTATO 


6719 


1 


691 


ptrpeeqdredgkchkwemnpisgnlncdpiamsqcssdhgcet 
dldsdddkiekp.nnfkkdsasqdnglsrkisrkrvcssdsdssl 
owk kss kar tg llr i tk rcaataan k i klms dvedvslenvht 
rskngrkkplhlacttakkklsdcegsvhcevpseoyacegkpp 
dpdsegstkvxs0alngdsdsedmlksehkhrktnihk1dapsk 
rksssvtssc- 


6720


3 


• .B22 


HEVAEEAGGTVyPORGT^PGTKRFCHVIETPEPGKWELTGYEAA 
VP J TEKSNPLTCDLDKADAEN I VRLLGQCDAEI FQEEGCALSTY 
QRLYSES I LTTKVOVAGKVQEVLKEPDGGLWLSGGGTSGRMAF 
LMS V S FNQLM KG LGQ KPLY TY LI AGG D RS WAS R EGTEDS ALKG 
I EEL KKVAAG KKR VI VI G I S VGLS AP FVAGQMDCCMNNTAVFLP 
VLVG FNP VSMAR H P F F ? PR I LRS LT V F?5 LRAPH Y 0 1 TS LI j FSM 
SWTL1SE 


6721


3 


822 


H F V AF F AfiGTV Y P ORG W PGT kt? FOHV1ETPEPGKWELTGYFAA 
VP I TEKSNPLTODLDKADAENI VRLLGQCDAEI FQEEGOALSTY 
QRLYSES I LTTHVOVAGKVQEVLKEFDGGLVVLSGGGTSGRMAF 
LMSVS FNQLM KG LGQKPLY TYLI AGG DRSWASREGTEDSALHG 
I EELKKVAAGKKRVI V 1 G I SVGLSAP FVAGQMDCCMNNTAVFLP 
VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYQITSLLFSM 
SWTLISE 


6722 


I 


390 


RSWSKRTWQALPMAVLFLLLFLCGTPQAAJDHMQAIYVALGEAVE 
LPCPSPSTLHGDEKLSWFCSPAAGSFTTLVAQVQVGRPAPDPGK 
PGRESRLRLLGNTSLWLEGSKEEDAGRYWCAVLGQHHNYQNW 


6723 


173 


6 59 


VCQYCTARMADFG I SAGQFVAVVWDKSSPVEALKGLVDKLQALT 
GNEGRVSVEN1 KQLLQSAHKESSFDI I LSGLVPGS TTLHSAE 3 L 
AEIAR1LRPGGCLFLXEPVETAVDNNSKVKTASKLCSALTLSGL 






Vh VAc.LyKt.PljI PoLVUoVhbHLGHEi: DNL 


6724 


173


655


VCQY CTAR MAD KG 3 SAGQFVAWWDKSSPVEAbXGLVDKLCALT 
GN EGRVSV EN I KQLLQ? AHKE SSFDI3 bSCLV PGSTT LHSAE3L 
At 3 ARILR PGGCLFLK EPVETAVDNNS KVKTASKLCS ALTLSGL 
VEVXELQREPLTPEEVQSVREHLGHESDNL 


6725 


356


722

RRRTPPVlIiAT«DDDLMLALRLOEEWNL0EAERbHA0ESLSl>VD 
AS W ELVDF 7 PDLQALF VOFNDQF FWGCLKAV EVKViS VR MTLCAG 
3CSYEGKGGMCS3RLSEPLLKLRPRKDLVEVFFV 


6726 


98 , 71< 



HJ,0KMERK1NRSEKEKEY2GKKNSLED7DWKNCKSTLMTIJWG 
G Y LY 1 TOKCTLTKY PDTFLEG 3 VNGK 3 LCP FDADGW Fl DRDGb 
LFIUJVl^FbRNGEbLLPEGFRENOblACEAtFFObKGlAEEVKS 
R W KKEQLTF R ETTFLE I TDNHDRSQG LR 3 FCNAPDFIS K 1 KSR 1 
VLVSKSRLDGFPEEFS3SSNI 3QFXYFIK 


6727 


1 


831 


FRGMGDERFHYYGKKGTPQKYDPTFKGP3 YNRGCTD3 ICCVFLb 
IJMVGYVAVG31AWTHGDPRKV3YPTDSRGEFCGOKGTKNENKP 

ylfyfn3vkcasplvllefqcftpqicvekcpdryltylnarss 
rdfeyykcfcvpgfknnkgvaevbrdgdcpavl3 pskplarrcf 
pa3 kaykg vlmvgnettyedghgsrkn3 tdbvegakkangvlea 
rqlamr3fedytvswywd1 3slg1amaksllf1 illrflag1mg 
rgm:img3lvlgy 


6728


48^ 


93S 


FCSSWLRSLAOSSLSWKMFLVGLTGGIASGKSSV10VFQOLGCA 
V3 DVDVMARR W0PGYPAKRR 3 VEVFGTEVLLENGDINR K VbGD 
blFNOPDRROLL>3AITHPE3RKF.MMKETFKYFbREPRTSPRGKK 
H VPSAI,KEADS LMR RDT 


6729 


25? 


ai9i 


VGLTGAQSGRTASMGRDCRAVAGPALRRWLLLGTVTVGFLACSV 
LAGVKKFDVPCGGRDCSGGC0CYPEKGGRG0PGPVGPOGYNGFP 
GLQGF PGLOGRKGDKGEKGAPGVTGP KGDVGARGVSGFPGADGI 
PGHPGOGGFRGRPGYDGCNGTOGDSGPOGPPGSEGFTCPPGPOG 
PKGQKGEPYALPXEERDRYRGEPGEPGLVGFQGPPGRPGHVGQM 
GPVGAPGRr-GPPGPPGPKGOQGNRGbGFYGVKGEXGDVGOPGPN 
G3 PSDTLHP3 l/'.PTGVTFHPDQYKGEKGSEGEPG I RGISLKGEE 
G3K 


6730 


784 


101S 


NMVDYYEVLGLCRYASPED1KKAYHKVALKWHPDKNPENKEEAE 
RKFKEVA3AYEVLSNDEKRP3 YDKYGTEGLNEF 


6731 


1 


446 


G3RKRbHGAVVPPVEVGCPWETRESEGVKLERPTSPLKN'NDEGS 
LD3 YAGLPS7\ VSDS ASKS. CVPSRNCLDLYEE I LTEEGTAKEATY 
NDliQVEYGKCGLOMKSLMKKFKEIOTONFSbl^NOSbKKNrSA 
LI KTARVE1 NRKDEEI 


6732 


102 


1205 


GRWORRPPPPSPPLWCUJPGGGSDPQOliTQbRHCLSHSPQDTPW 
AORQVCYTAATT0AAAPATRNCLPDKSGHRPTPPRSHRHHROEN 
LGSIKPSSRS TKATS TTMAGDG RRAEA VR EG WG VYVT PRA P 3 R E 
GRGRbAPONGGSSDAPAYRTPPS^QGRREVRFSDEPPEVYGDFE 
PLVAKERSPVGKRTRLEEFRSDSAKEEVP.ESAYYLRSRQRRQPR 
P0ETEEMKTRRTTRL00OHSEQPPLCPSPVMTRRGLRDSHSSEE 
DEASS0TDLSOT3 SKKTVRS 3QEAPAVSEDL.VI RLRR PPLRY PR 
YEATS VQQKVNFSE EGETEEDDQDSSH SSVTTV KARSRDSDESG 
DKTTRSSSQYIESFW 


6733 


613 


1311 


RSCRCVGMR S RNQGG ES ASDGK 3 SCPKPS 1 1 GNAGEKSLSEDAK 
KKKKSNRKEDD\WASGTVKRHLKTSGECERKTKXSLELSKEDL3 
QhhS 3 MEGELQAREDVIHMLKTEKTKPITVLEAHYGS AEPEKVLR 
VLHRDAI LAOEKE3 GEDVYEKP1 SELDRLEEKQKETYRRMbEQL 
LLAEKCHRRTVYELENEKKKHTDYMNKSDDFTNLLEOERERbKK 
LbEOEKAYQARKE 


6734 


189


551


SAAMFPVFSGCFOELOEKJ'JKSLEbVSFEEVAVHFrWEEWQDLDD 
AQRTLYRDVM1,ET YS S 1.VSLGKCX TKPEM 1 FKLEQGAEPW I VEE 


TLN LRLSGGS K KCV f SGI CKRS LV2U ^ VH LV 


6735 


28C 


S58 


KSR RAGVTKMS NPFL KG; V FNKDX TFR I X R K FEPGTQRFELHKKA 
OAS LNAGLDLR LAV"OI*PPGE DLNDWV i VK WDFFNRVNLI YGT I 
XDGCT 


6736 


1S1 


808 


MN Y E I>0 F K R E M PN 3 K S LGL'J K LN FLLK M >S S V LP L I TDY VYFEN 
SSSm-YLl RRI EELNKTASGWEAKVVCFYRRRDI SN7L1MLAD 
KJlAKElEEESETTVtADLTDKOKHQLKr'.RELFLSROYESI.PATH 
3 RGXC5V7\LLNKTES VLSYl.rKEDTHl- YSLVYDPSLXTLLADXG 
El RVGPHYCAD1 PEMLLEGTFFCVFAV 1 


6737 




1209 


PV i MPLHFS PGD1 VK P SCCVSS SPKLF. 'nNAHSHbESYRPDTDLS 
KEDTGCNLOHI SDRENIDDLNMEFNP* I-KPRASTI FLSKSQTDV 
REKRKSLF1NHHPPGCIARKYSSCST1 FLDDSTVSQPNLKYT1K 
CVALA1 YYH I KNRDFDGRMLLD1 FDEN LH PLSKSEVP PDYDKHN 
FEQKOJYRFVRTLFSAAOLTAECAIVTLXnfLERL.LTYAEIDICP 
ANWKR I VLGAI LLASKVWDDQAVWNVDYCQl UCDITVEDHNELE 
R0FLELL0FN1WPSSVYAKYYFDLRSLAEANNLSFPLEPLSRE 
RAUK LEA JSRI,CEDKYKDLRRSARKR5ASADNLTLFRWSPAI IS 


6738 




653 


CACAFOPARAE VG AATALPVRW ASGEN/ . 1 SG S LAVFLA VLVLLb 
WC7\PWTHGRRSNVRV1 TDENWRELLEGI "MI EFYAPWCPACQNL 
QPEWESFAEWGFPLEVNIAKVDVTEQPC ISGRF3 1TALPTI YHC 
KDGEFRRYQGPRTXKDFINF 1 SDKEWK5 1 EPVSSWF 


6739 


3 


631 


SWPDMAKE EVA K.L£KhLML3 .RQEYVKL C KKUAETEKRCALLAAQ 
AN K F, K SSF S F I S RLLA 1 VADLY EQEO Y f PLK 1 KVGDRH I S AHKF 
VLAAPSDSWSLANLSSTKELDLSDANPrVTMTMLRWl YTDELEF 
REDDVFLTELMKLANRFQLQLLRERCEKGVKSLVNVRNCIRFYQ 
TAEELNASTLMNYCAEI I AS} IWVSEVE\~ VNKAI* 


6740 


3 


631 


SWPDKAEEEVAKLEKh'LMLLhOEYVKU KKLAETEKRCALIAAQ 
ANKESSSESF1SRLLA3VADLYE0EQY-' TJLK3 KVGDRH JSAHKF 
VLAARSDS WS LAN LS S TK E L D LSDANPi ' VTMTMLR W I YTDE LEF 
REDDVFLTELNKLAKR FQU>LLRET?CEXSVMSLVNVRNC3 RFYQ 
TAEELN ASTLKNY CAE 1 1 ASH WVSEVEC- VNKAL 


6741 


143 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPV 1 LATAG YDHTVR FWQA 
HSGlCTRTVQHODSQVNALEVTPDRSMi -AAVQPVSLGYQHIRM 
Y Dl.-NSIW PN P 11 S YDG VNKN 3 ASVGFHL IXSR WMYTGGEDCT ARI 
WDLRSRKliOCORIFOVNAPINCVCbHPKOAELIVGDOSGAIHIW 

dlktdhnec'li pepevsitsahi dpda> ymaavnstlvpfscll • 
plajgii<;kgefesi>arrgllflacg^k^yvwnltggigdevto 

L1PKTKIP 


6742 


141 


960 


PLtLPFSSRARAGHTMNTSPGTVGSDPV: LATAG YDHTVRFWOA 
HSGI C7RTVQH0DS0VNALE VTPDRSM j AAAVQP VSLGYQH I RM 
YDLNSNNPNPI 1 S YJDGVNKKI ASVGFHF.DGRWMYTGGEDCTARI 
WDLiRS RNLQC0R I FQVNAP I N CVCLH PKCAELI VGJDQSGA I H I W 
HI .JCTnHNEOLtlPFPPV c 57T i ;AV} 1 DPDA ^ v MAAVN <?TLVPFS CLL 
PLA1 G3 LOEGEFESlJ^RRGliLFLACQGNCYVWlvLTGGlGDEVTQ 
L3PKTX1P 


6743 


1


412


MKSTODKSLHLEGDPNPSAAPTSTCAPFKMPKRJSISKOIASVK 
ALRKCSDLEKA1 ATTALI FRN SSDSDG K^EKAI AKDL.LQTQFRN 
FAEGQETKPKYRE I LSELDEHTENKLDF LDFMIIiLLS ITVMSDL 
LONIR 


6744


55 


1343 


RTPARNR CAGCEVLS RFSSPN KAS S FALL'S AGGGL PA VRALRRD 
ROK VSTVG YGMDEVEQDQHBARLKEXFDS FDTTGTGS LGQEELT 
DI,CKMLSLE2VAPVL0QTUjODNLLGRVKFDQFKEAL3LIliSRT 
L£NEEHFOEPDCSLEAOPKYVRGGKRYC-KRSLPEFOESVEEFPE 
VTVIEPLDEEARPSHIPAGDCSEHWKTOFSSEYEAEGOLRFWNP 
DDLNASCSGS SPPODW I EEKLO-EVCEDLG j TRDGHLNRKKI.VS I 
WO 01/53312

PCT/US00/34263












CEQYGLONVDGEKLEEVFKNLDPDGTMSVEDFFYGLFKNGKSLT 
PSASTPYROLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHASVERJ LDTWQEEG 1 ENSQE J LKALDFGLDGNI NLTEL 
TLALENELLVTKN 5 I HQAC1 


6745 


3 


S88 


TFRIWWAQRRRWLLGCASWESWEAA1AAGPGLPSSTARQ0NNP 
AAGTEC FAAVWAR GTAMGS VLS VDSG KS A? AS ATARALERR R DP 
ELP VTS FDC AVCLEV LHQP VRT RCGKV FCR SCI AT SLKNN K WTC 
PYCRAYLPSEGVPATDVAKRMKSEYKNCAFXDTLVCLSEMRAHI 
RTC0KY1DKYGPL0ELEETA 


6746 


11C 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSOATEKSSYFCTTEI 
S LWTWAAI QAVF K KMESQAAR LOS L EG K TGTAE KK liADCE KrtA 
VEFGNOLEGKWAVLGTLLOEYGLU?RRLENVENLLRNRN 


6747 


247 


484 


EAVTFKDVAVVFTEEEI^GLLDLAORXLYRDVMLENFRNLLSVGH 
0PEHRDTFHFLREEKFWMMD1AT0REGNSVYAGVC 


6748


201 


665 


MTTFKEAVTFKPVAWFTEEELGLLDPAORKLYRDVMLENFRNL 
LSVGNCPFHQDTFKF1/3KEXF1W4KTTSOREGNSGGK1 Ql EMET 
VPEAGPHEFWSCOOI WEQI ASDLTRSCNS I RNS50FFKEGDVPC 
QIEARLSISXVQQXPYRCNECXO 


6749


95 


719 


RREVKGGDGVCPR7-J?GSPQS00FPSCAGGGEGL00SGEALDGAM 
SAGGPCPAAAGGGPGGASCSVGAPGGVSKFKWLEVLEKEFnKAF 
VPVPLLLGEI DPDOADI TYEGRQKJlVlSl.SSCFAQhCHKAQS VSQ 
J NH KLEAQLVDLKS KL'l' El'QA EKVV1.EKEVHDOLLOLH S 1 OLQL 
HAKTGQSADSGTI KAKLSGPSVEELERELKAN 


6750


3 




SCESRRPGAKl^AfGALPRDTTGLGSEOPSGDVAOSNRATMGT 
TA P G P 1 H LL ELCDC K LM EF LCNM DN KDLVW LEE I QEE AER M FTR 
EPS KEPELMPKTPSOKNRRKKRR 3 S YVQDENRDPI RRRLSRRKS 
RSSQLSSRR 


6751




1417 


PTKATEMAGASVKVAVRVRPFNSREMSRDSKCIIOMSGSTTT1V 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
OKAFEGYNVCI FAYGQTGAGKSYTMMGKOEKDOOGI 1 PQLCEDL 
FSR 1 NDTTNDNMS Y5 VEVS YMEI YCER VRDLLNPKNKGNLRVRE 
HPLLGPYVEDLSKLAVTS YND 1 QDLMDSGNKARTVA/^MNETS 

SRSKAVFN12 ftokrkdaetnittekvsk: slvdlagseradst 

GAKGTR LKEGAN I NKS LTTLGKV 3 S AI AErXDSG PNKNK K K K KTD 
FIPYRDSVLTWLLRENIiGGNSRTAMVAAl^PADINYDETLSTLR 
YADRAKOIRCNAVINEDPIWKLIRELKDEVTRLRDLLYAOGLGD 
I TDM7NALVGMS PSSSLSALSSRNV 


6752 


24 


1834 


RNCVPPLGCYRSRVKFHSDIXMQYSKHCEHLLERLNKOREAGFL 
CDCT J V J GEFOFKAHRNVIiAS FS E Y FG A I YRSTS ENNVFLDQSQ 
\n<AXX?FOKLLEFIYTGTLNLDSWNVKEIHOAADYLKVEEVVTXC 
KIKMEDFAFIANPSSTEISS ITGNI ELNQOTCLLTLRDYNNRSK 
S EVSTDL I OANP KOGALAKKSSOTKKK K KAFKS PKTGQNKT VQY 

TFPAQD I VHTVTVKRKRGKSQPNCALKEHSMSNI ASVKSP YE AE 
NSGEELDQRYSKAKPMCNTCGKVFSEASSLRRHMRIHKGVKPyV 
CHLCGKAFTnCNOLKTHVRTHTGEKPYKCELCDKGFAOKCQLVF 
HSRMHHGEEKF YKCDVCNLQFATSSNLK T H AR KHSGEKP YVCDR 
CGQR FAQASTLTYKVRRHTGEKP YVCDTCG KAFAVSS S LI THSR 
KHTGEKPFI CELCGNS YTDI KWLKKHKTICVHSGAJDXTLDE SAED 
HTLSEQDS 3 OKSPLSETMDVKPSDMTLPLJVLPLGTEDHHMLLPV 
TDTOSPTSDrLLRSTVNGYSEPOLlFLQQLY 


6753 


2 


1305 


VPSLPYPPOKWAHTEFTTSSDSETANGIAKPPPVMPGGEEKAS 
PFG 1 KLRRTNYSliR FNCDQOAEOKKXKRHSSTGDS ADAGF PAAG 
S/ttGEKEMEGVALKHGPSLPQERKOAPSTRRDSAEPSSSRSVPV 
AHPGPPPASSOTPAPEHDKAANKMPLAOKPALAPKPTSOTPPAS 











PLSKLSR PY LVELLSRRAGRPDPEPSEPSK £ DCESSDRRPFS PP 








GPEEKKGQKRDEEEEATERKPASPFLPATOOEKPSQTPEAGRKE 








K PMLQSKHS 1»DGS KLTEKVETAOPI/W J TLAbOKOKGFREQQATR 








EERKCAREAX0AEKLSKENVSVSV0PGSS5VSRAGSLHKSTALP 






( EEKRPETAVSRLERREQLKKANTLFl'SVTVFISYSSPAAPLVKE 






1 VSKRFSSPDDAPVSSEPAWLAIJSiKRKAKAWSDCPLIIK 


6754




413 


J VRRRKRRLGGPEVNTMSSLHKSRIADFODVl.KEPSl ALEKLRE 
IS FSG 1 PCEGG LRCLCW KI LLNYLP UERASK TS I LAKQRELYAQ 
PLREMI IQPGl AKAKMG VSREDVTFE DHPLK PN PDSRWNT Y FKD 
KEVLL 


6755




1343 


rGL0L.GVAhEADW'rLDMPGGRRGPSROO^R^ALPSLQTLVGGG 
; CGNGTGLRNRWGSAlGLPVPPlTALITPGPYRHCOirDLPVDGS 
L LFE F LFF I Y h LVAL F 1 0 Y I N I YKTVW WY p VK H P AS CTS LN FHL 
3 DYHLAAF1TVMLARRLVWAL1SEATKAGA/.SMI HYMVLI SARL 
VLl.TLCGWNO.CWTLVNLFRSHSVLNljLFEGrPFGVYVPIjCCFHQ 
| i;5TRAHLLLTDYNYVVQHEAVEESAST'/GGIJlKSKDFL»SLIiLESL j 
KEQFNNATPI FTHSCPLSPDLIPJ4EVECLKADFNHRIKEVLFNS 
I FSAYYVAFLPLCFVKVSGYUTFMCFLDLCV7CYJNWVFLV 


6756 


iec 


754 


1 1 ERALGSLPhSIPVSWGSLRTLKYQQQPLRI K-VLLCQTRVOCHD 
( LRSLQPQPPGLKOSFCLRVLGlOTGATTPGl.RDbTCKELl-LTE 
RFAOKRKKRKEKESGMALTQGPLTFRDVA1EFS0EEWKSLDPVO 
KALYWDVMLENYRNLVFLGKDKFAI.EVKICFRVFLYFLCCLSWE 
PFHYLTZTEALLTHK 


6757 




45S 


KSRVEAPEAHSRESOGSDAMRKHLSWWWLATV'CMLLFSHLSAVO 
TffGIKHR ] KWNR KALPSTAQITEAOVAENRPC-AFI KOGRKLDID 
FGABGNR Y YEAN V WQFPDG I H YNGCS EANVTK EAFVTGCI NATQ 
AANCCEFQKPDNKLHQQVLW 


6758 


a 


iooe 


A^GPEl/PGRRFR DRAPWL»PAR1jLRGV1*AVWVSLSA1»GPGSFCRR 
RVPSLAQLGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPS 
i.F PS FRRNMANK S P ALTGNS OPOHOAAAAAACOQGOCGGGGATK 
PAVSGKOGNVLPLWGNEKTWNLNPKilLTrJILrSPYFKVOLYELK 
TYHEWDEI YFKVTHVEPWEKGSRKTAGQTGMCGGVRGVGTGGI 
VETAFCH.YKLFTLKLTRKOVMGLITKTDSP V I RALGFWY 1 RYT 
QPPTDLWDWFEEFLDDEEDLDVKAGGGCVMTDGEMLRSFLTKLE 
WFSTLFPRIPVPVOKNIDOQIKTRPRKI 


6759




513 


KKHNFHSLDGTSTRAFHPQTGLPLLSSPVPO^KTOSGCFDLDSS 
LLHL.KS FS SRS PR PCLNI EDDPDI HEK PELS S S APP I TSLSLLG 
NFEE^SVLNYRFUFLGI VEXSFTAEVGASGAFCPTHLTLiPVEVSFY 
SVSDDNAPSPYMGVITLESLGKRGYRVPPSGTIOWCVL 


6760 




606 


VI SKKXGLSAEEKRTRMME1 FSETKDVFQLKDLEKl APKEKG1T 
AKS VKEVLOSL VPDCMVDCER 2 GTSW YWAFPS KALHARKHKLE 
VLESQLSEGSQKHASLCKSIEKAKIGRCETEE^T 


6761 


29 


1733 


EFTLRGLREVAAPSDVADAAVSRRGRCCCCbHCTOTQVAQDCPS 
55 S S VOR CELSLFQSLHTMTS K KLVNS VAGCADDALAGLVACNP 
NLCLIjOGHRVALRSDLDSLKGRVALLEGGGSGUEPAHAGFIGKG 
MLTGV I AG AV FTS PAVGS I LAA 1 RAV AQAGTVGTLL 1 V KNTYTGD 
R LN? GLAR EQARAEG 1 PVEMW2 GDDSAFTVLKKAGRRGLCGTV 
1 1 HK\^A1J^AGVGLEEIAK0VNVVTKAMGT1^VSLSSCSVTG 
S KPTFELS ADEVELGLG I HGEAGVRR 1 XMATADE I VKLMLDHMT 
NTTNASHVP VQPGSS WT4MVNNLGGLS FLELG J I ADATVRSJUEG 
RGV KIARA1VGTFMSALEMPG3 SMN..U.VDEPLLKL2 DAETTAA 

AWPNVAAVS : tgrxrsrvapaepqeapdstaaggsaskrmalvl 

ER V CSTLLG LEEHLNALDRAAGDGDCGTTHSRAARAI QEWLKEG 
P P FAS PAOLLS ICS VLLLEKMGGS sg al yglfltaaaqplkakt 
SLFAWSAAMDAGLEAMQKYGKAAPGDRTMLDS LWAAGQBL \ 


6762 




613


ASTISWRLCVAGAEARRPVPVAGLRAGGGAMWFMYLLSWLSLFI
OVAFITI^VAAGLYYLAELIEEYTVATSRl I KYMIWFSTAVblG
L Y V FER F PTS M I G VG L FTNLVY FGLLOTFPFI MLTSPNF I LSCG
LVVVNHYLAFQFFAh'EY Y PFSEVI.AYFTFCLWI I PFAFFVSLSA
GENVLPSTMQPGDDVVSNYFTKGKRGK


6763 




76C 


SGFDFPGRRFRGCCCVRPPAGAGHELGGKWDMNSAPRLVSEIAE 
PK0E0KTGTEAEAADSGAVGARRFLLCLYU3GFLDLFGVSMWP 
LLS LHVKSLGAS PTVAG IVGSSYGI LQLFS STLVGCWSD WGRR 
S S LLAC I LLSALGYLLLGAATNV F \> FVLAK VPAG I FKHTLS I SK 
ALLSDWFEKERPLV1GHFNTTASGVGFI bGPWGGYJiTELBDGF 
YLTAFICFbVFlLNAGLVWFFPRREAKPGSTE 


6764 


80 


436 


IjK KM DTMKLS VRNLFEQL VRRV E I LS EGNE VQF I QLAKD FEDFR 
K K WQR TDV) ELG K Y KDLLM KAETER S ALD VKL KJIARNQVDVEI KR 
RORAEADCEK1XR01 QLI REMLMCDTSGSI £ 


6765 


3 


550 


AR Y S R VDH F CRRRCRA VARAPR FLLOFPSG PS RH FLAAC VARWL 
ACrSVLiVitAljoCjoAKDGI VI fcVAVGVKJ<GiDtLljoGoVLii>i>PNS 

nmssmwtangndskkfkgedkmdc;apsrvlhir.klpgevtete ; 
V 2 ai>glpfgkvtni lmlkgknoaflelateeaai tngnyys avt 1 

PKLRNQ i 


6766


D 


2 2 87 


EGGSFXASLTWLWPLGEMKLHCEVEVI SRHLP AliGLRNRG KG VR ; 
AVLSLC00TSRS0PPVRAFLL1 STLKDKRGTRYEL.REN3 EQFFT ' 
KFVDEGKAT\^LKEPPVD1CLSK;.-NSSSLKGFLSAMRLAHRGCN , 
VDTPVSTLTPVKTSEFENFKTKWVITSKKDYPLSIOJFPYSLEHL ''■ 
QTSYCGLVRVDMRMLCLKSLRKLDLSHNHI KKLPATIGDLIHLQ 
ELNbNDNHLBSFSVALCHSTLOKSLWSLDLSKNKIKALPVQFCQ 
l^ELKKLKLDDNELlQFPCKIGOLlNLRFLSAARNKLPFLPSEF 
RNhSl.EYbDbFGNTFEOPKVLPV: KLOAPLTLLESSARTl LHNR 
I PYGSHI 3 PPHLCQDL.DTAXI CVCGR FCLNSF 1 QGTTTMNLHS V 
AHTWbVDNLGGTEAP J 1 S Y FCSLGCYVKSSDl 


6767 


336 


915 


APM1 CLCSSDLQFRYKEAFLRDRGLOI GYCSVDDDPRMKHFLNV 
GRLOSDNEYKKDFAKSRSQFHSSTDOPGLIjOAKRSOOI^ASDVHY 
ROPL.PQPTCDPEOLGLRHAOXAH0LOSDVKYKSDLNLTRGVGWT 
PP<?SYKVEf4ARRAAEl^ARGLGLOGAYRGAEAVEAGDH0SGEV 
NPDATEILHVKKKKALLL 


6768 


2 


363 


PGETISCYLLSEGSLPbCMQVACGEEKHRAPTOKTLRARFKKTE 
LRLSPTDLGSCPPCGPCPIPKPAARGRROSQDWGKSDERLLOAV 
ENNDAPRVAAbI AR KGLVPTKLDPEG KS AFHL- 


6769 


284 


396 


MSTPDFSTAENNOELANEVSCLKAMLTLMbOAMGQAD 


6770 




397 


QPJ^YOVIWSSTMAKLHDYYKDEVVKKLMTEFNYNSVMOVPRVEK 
I TLNMGVGEAI ADKKLLDNAAADLAAI SG0KPL1TXARKS VAGF 
KIR0GYPIGCKVTLRGERMWEFFERLITIAVPR1RDFRGLSAKS 


6771 




375 


APAGTLAMTGKS VKDVDR YQA VLANLLLEEDNKFCADCQS KGPR 
WASWN2GVFICI RCAG JHRNLGVH1 SRVK5VNLDQWTQEQI QCM 
OEMGNGKANRLYEAYLPETFRRPOIDPYLPaISNLEG 


6772 


1 


1400 


aaaflogmtvngfintvitsl\errydlhsyosgl:assydiaa 

CLCLTFVS YFGGSG \ HKPRWLGWGR WLMGTGSLVFALPHFTAG 
P * * GWKLDAG VRTCPANPR \ P VCAG \ H TSGLSR YQL VFMLGQFI* 
HGVGATPLYTLGVTYLDENVKSSCSP 1 Y I AI FYTAAI LGPAAG Y 
LI GGALLNI YTEMGRRTELTTESPLWVGAWWVGFbGSGAAAFFT 
AVPJLGYPRQIiPGSQRYAVMRAAEMHOLKDSSRGEASNPDKGKT 
IRDLPLSIWLLLKNPTF1LLCLAGATEATL1TGKSTFSPKFLES 
OFSLSASEAATLFGYLWPAGGGGTPLGGFFVNKLRLRGSAVIK 
FCLFCTWS bLG 1 1*V FS LHCP S VPMAG VT AS YGGS LL PBGHI-NL 
TAPCNAACS CQPEHYSPVCGSDGLMY FSLCHAGCPAATETNVDG 
OKVYRDCSCI PQHLSSGFGHATAGKCTST 
WO 01/53312 PCT/US00/34263



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6773


3 


63C 


PWEAPKEHKYKAEEHTWLTVTGEPCHFPFOVHRQLYKKCTHKG 
RPGPQPWCATTPNFDODQRWGYCLEPKKVKDHCSKHSPCOKGGT 
CVTWPSGPHCLCPQHLTGNKCpXEKCFEPQLLRFFHKNElWYRT 
EOAAVARCQCKGPDAHCORLASQACRTNPCLHGCRCLEVEGHRL 
CHCPVGYTG?FCDVGE * GSGAS RRPAPRWDGLAR 


6774


140 




iTEbSDOOY ?LFr 3 LS S / WVPT FLSKDVDGRV I KAJDSFS K 1 1 SS 
G LiR I G Fl»TG P KPL I ER V I LK 1 0 VS TJUHPS T F NQLM I SO 


6775






■j , ci\soi j rvltarggrrapsp0lwtlvlaljeskwrshrli.rkns 
grpetmenlpalyt3fogevamvtdygaf:kipgcrkoglvhrt 
kmsscrvdkpse1vdvgdkvwvkligremkjjdr1kvslsmkww 
qgtgkdld pnnv \ s l£ k krgg gd psr 3 tlgr r s plrls 


6776


3 


noe 


HER HERH EGALSQDALLR I S 1 PLDSNMR P EKC HRF VHPQWOLLH 

7SVAKFV FMAGMMVGG 1 LGGHLS DR FGRR FVLR WCYLQVA X VGT 
CAALAP-l-FLIYCSLRFLSGIAAMSLITNTIMLlAEWATHRFOAM 
GITLGMCPSGIAFMTLAGLAFA; RDWH3LQLWSVPYFV I FLTS 

KSTMXKELEAAQKKKPFLSERLHWPNICKR ISLLPFTKFANFKA 
YFGbNI.HG/LKT-lLGNNVFLLOTl.FGAV/TPPGQLVLHLGHWGSG 
RVS S RGRVNCLGLF VLOVW 


6777


775 


63 


CFFHGPAWRDCSVRATFAKKOGOSGIISCJAFSPAOPbYACGSy 

AELLCWDLRQSGYPLWSbGREVTTNORIVFDLDFTGQFLVSGST 
SGAVSVWDTDGPGNDGKPEPVLSFbPQKDCTNGVSbKPSbPLLG 
HCt.PVSVCFbSPTESGGRRRGAGPSLGSPRRHVULECRLOLWWC 


6778


311 




3 OS I TDES RGS I RRKNPAMTRLF. bNVP\ EBTAGDS E/ERS P EEE 
VOADPRIRSASPKCPTSSPFPKGRSPEGEGET\i;PEKVHFHPGP 
KDKSVAEKN\KGP\SPVSSEG3KBFFSMKPEWENLNQSNVRRMK 
TXAVRLNEVIVKKSRDAKLVXLNMPGPPRNRNGDENY 


6779


7 


53t 


RALRKQPRLLAANGIEPESMA3 S^PIKGSRKPCVNKEELAuKKP 
MAKCAWKGPREPPODARAEAES PGGASESDQDGGHESPPK KKAV 
AW VS AKNPAPMR KKKKVS LGP VS Y VLVDSEDG R K K PVMP K KG PG 
SRREASD0KAPRG0CPAEATAST5RGPKAKPEGSPRRATNESRK 
V 


6780


3 


40 j 


H E VN DNKPE1 N I NLMS PGKEEI 5 Y 1 7EGDF 1 DTFVALVRVQDKP 
SGLNGEI VCKLHGHGKFKLQKT VENMYLI LTNATLDREKR SEYS 
LTV1 AEDRGTPS LSTVKK FTVQ 1 WD I NDNP PHFQR SR YE FV I SE 
K 


6781 


1 




AFTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHPELSEVS 
SNVAPSIPPVMSRPVSSSSISTFLPPNQITVFVTSNPirJSANT 
SAALPTHLQSALMSTVVTMPNAGSKVMVSEGQSAAQSNAJIPQFI 
TPVFINSSSIIQVMKGSQPSTIPAAPLTTNSGLMPPSVAVVGPL j 
HI PON I KFSSAPVPPNALSS SPAPN I QTGRPLVLS SRATPVQLP 
SPPCTSSPWPSHPPVOOVKETA'PDEASPOVNTSAIX^ITbPSSO 
STTM VS PLLTNS PG SSGNRRSPVS SSKGKGKVDK I GQILLTKAC 
KKVTGSbEKGEEOYGADGETEGOGLDTTAPGLMGTEObSTELDS 
KTPTPPAPTLLKMTSSPVGPGTASAGPSLPGGALPTSVRS J VTT 
b V PS ELI S A V P 7TK SWGG I AS FS LAG 


6782


3 


1327 


RKPTVIRIPAKPGKCLKEDPOSPPPLPAEKPiGNTFSTVSCKLS 
NVERTRNbESNHPGOTGGFVRVPPRLPPRPVNGKTIPTQQPPTK 
VFPERPPPPKLSATRRSNKKLPFKRSSSDMDLOKKOSNLATGbS 
KAKSQVFKN0DPVLPPRPKPGHPLYSKYMLSVPHGIANED1VSO 
N PG E LS C KRGD VLVML KQTENNYLECQKGEDTGR VH LSQMK LIT 
PLDEHLRSR PNPFS PPKA PSHAOK PVDSGAPHA WLHDFPAEQV 








DDLNLTSGEJ vyi/^EKIDTDWYRGNCRNOIGI FPANYVKVI I DI 
PEGGNGKRECV/ SHCVKGSRCVARFEYIGEOKDELSFSEGEIII 
:,KEYVNEEWA- C SVRGRTG I FPLNFVEPVEDYPTSGANVLS7KV 
P I .KTK KEDSG ^ ?■* S QVNS LPAEW CEALKS FTAETS DDLS FKRGDR 
I 


6783 




1750 


SYHKHHAQQSA ».AS PNLTASQKTVTTTSMITTKTLPLVLKAATA 
TMPASWGQRJ -" AMVTAINSOKAVLSTDVQNTPVNLOTSSKVT 
G PG AE AVQI VA ?L\'TVyLOVQATPPQP I KVPOF I P P PR LTPR PNF 
LPQ VR P K P VAC PA P P PMLAA PQh I OR P VMLTK FTP7TL 
PTS ON S I H P V K X'VRG QTAT 3 AKTFPMAQLTS I V I AT ?G T R LAG P 
Q1 VQLSKPSLL r^VKSKTETDEKQTESRTITPPAAPKPKREEN 
POKLAFMVSLC-1 VTHDHLEEIQSKRQERKRRTTANPVYSGAVFE 
PERKXSAVTY^iTMHPGTRKRGRPPXYNAVLGFGALTPTSPQS 
SHFDSPENEK7 rTTFTFPAPVOPVSLPSPTSTDGDlHEDFCSVC 
RKSGOLLMCD7C5RVYHLDCLDPPLKTIPKGMW1CPRCQDQMLK 
KEEA1 PWPGT1. ' . : VHSY 1 AY KAAKEEEKQXLLKWSSDLKOEREQ 
LEQKVKOLSNJ 3 S. KCMEMKNTI LARQKEMHSSLEKVKOLI RLIH 
01 DL.S KPVDSY a 7VGA1 SNGPDCTPPANAATSTPAPSPSSOSCT 
ANCNQGEETK 


6784 




175,0 


S Y ) L} 1 HHAQQS A / S PNIjTA S Q KT VTTT SMI TTKTLP LVL KAATA 
TKPASWG0R1 *: :AMVTAINSQKAVLSTDVONTPVNliOTSSKVT 
GPGAEAVQI V; K!;TVTLOVOATPPOP J XVPQF1 PPPRLTPRPNF 
LPOVRPKPVACivKlPIAPAPPPMLAAPOLIORPVMLTKFTPTTL 
PTSUNS 3 HPVR W NGQTAT1 AKTFPMAQLTSIVl ATPGTRLAGP 
OTVCLSXPSLr.CTVKSHTETDEKOTESRTITPPAAPKPKREEN 
PQKU^FHV SIX 1.-V7HDHLRE 1 OSKRQERXRRTTANPVY SC AV FE 
PERKKSAVTY1 N.«. TMHPGTRKRGRPPXYNAVLGFGALTPTSPQS 
SKPDS P ENEKTI 7 TPTFPAPVQPVSLPS PTSTDGD1 HEDFCSVC 
RXSGOLLMCD7CSRVYHLDCLDPPLKT3PKGMWICPRCODOMLK 
KEEAI PWPGTL' j VKSYIAYKAAKEEEXOKLLKWSSDLKQEREQ 
LEG K VKOLSNS : 5 KCMEMKNTI LARQKEMHSSLEKV KQbl KL I H 
G I DLS KPVDS KI TVGA T SNGPDCTPPANAATSTP APS PSSQSCT 
ANCNOGEETK 


6785 


1 


b?.a 


LiGMTVLHY CSWV £ KP ECLKLLL-RS KPTVDl VNQAGET ALD 1 AKR 
LXAT0CEDLLS^AKSGKFNPHV}1VEYEWNLR0EE1DESDDDLDD 
KPSPVKKERSPK l-OSFCHSSS ISPQDKLALPGFSTFRDKQRLSY 
GAFTNQIFVST. 1 '.' DSPTSPTTEAPPLPPRNAGXGPTGPPITPHR 


6786


182( 


1357 


KS V KVLVI APTr. :•; LANHVSRD FXDI \TRKLTVARFYGGTS YQSQ 
I NH I RNGIDI IAG7PGR I KDHLQSGRLDLSKLRHWLDEVDOML 
DLGFAEQVED3 " HESYKTDSEDNPQTLLFSATCPQWVYTVA\KK 
YMKSRY EQVDL! C KMTC/KAATTVSHLA 1 QCHWSQRFAV 1 GDVLQ 
VYSGS EGRAI I z CETXKNVTErlAMNPni KQNAQCiiHvjUIAUbsjR 
E I TL>KG FR EGS F Y. VLVATNVAARGLDI PEVDLVl QSS PPQDVE5 
Y IHRSGRTGRAC-F TG I CICFYQPRERGOLRYVEQKAG ITFKRVG 
VPSTMDLVKSK5 KDAIRSLASVSYAAVDFFRPSAQRLISEKGAV 
DALAAALAH ISG ASS rSPRSLl TSDKGFVTMTLESLE E I QDVSC 
AWXELNRXLSSN'AVSQITRMCLLKGNMGVCFDVPTTESERLOAE 
WHDSDW1LSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 
RSGGRSGGRSGPCSRQGSRSGSRODGRRRSGNRNRSRSGGHKRS 
FD* VFYHLVDF1 S DFLVDSVYIjTGRQIDHLTGLTGLIDHLTSHS 
SVWM 


6787 


264£ 


2270 


PSSFPKNVPLE: I.EEPPX*KRSGLGSLTPKSQIONGP*PQTFFF 
FELGSPSGVI Sr. CN LR LLGS SDS PAPAS R VAGI IGTCHHAWLI 
LVFbVEKGFHFrs ! GOAGLKLLTL\VIHPPWPPKVLGLQT 


6788 


16 


S36 


G G T V DLR \ DM LA V £ VLAA VR GGR / ATVR R VR ESNVLH E X S KG KT 
R EG AE D KMTSG H V LSNRKM FYLLXTAF P S VQ I NTE EHVD \ ELDQ 








EVILWGS*D5*GYPKGK« LLPKEVFSR/RVLLSGLT?LDATQE\ 
FTEDLSK\YVTTK»VCVAVKGKPMLGV1 HKP FSeYTAMAMVDGGS 
NV KAR SSYNEKTPRI WS KSHSGMV KOV ALCTFGNQTT 1 1 PAGG 
AGVKVliALLDVPPKSOEKADLYIHVTY; KXVJDICAGNAILKAXG 
GHMTTLSGEEI SYTGSDG1 ECGLLAS1 RKjWJOALVRKLPDLEKT 

ghk 


6789 


2 


678 


GNG 1 NVLK 1 APES A I KFKAYEQ I KRLVW » • P GDS » G F/YERLVA 
GSIAGAIAOSSIYPMEVLKTRMAI.RKTGOYSGMIjDCAKRILARE 
GVAAFYKGYVFNMLG1IPYAGIDIJVVYETLXNAWL0HYAVNSAD 
PGVFVLIJVCGTMSSTCGOIASYPLALVRTPJ'IOAOASIEGAPEVT 
MS S LFKHJ LR TEGA F G L YRGLAPNFM KVIPAVSIf YWYENLK I 
TLGVQSR 


6790 




4068 


APPAGRRRMQAAPRAGCGAALLLKIVSSCLCRAWTAPST5QKCD 
EPL.VSGLPHVAFSSSSSISGSYSPGYAK1NKRGGAGGWSPSDSD 
HYQWUJVDFGNKKOISAIATQGRYSHSDWVTQYRMI.YSDTGRKW 
K P YHQ DGN I WAF PGN 1 NS DG WRH El.QH PI IAR YVR 1 VPLDWNG 
EGRIGLR1EVYGCSYWADVINFDGHWLPYRFRNKKMKTLKDVI 
ALN FKTS ESEG VI LHGEGQOGDYI TLELKKAKLVLS LNU3SNQL 
G P I Y GHTS VMTG $ LLDDHK W J J SW I E RQGRS 1 NLTLDKSMQHFR 
TNGEFDYIiDLDYEITFGGl PFSGKPSSSSRKMFKGCMESINYNG 
VNITDLARRKKLEPSWGNLSPSCVEPYTVPVFFNATSYLEVPG 
RLNODLFSVSFOFRTWNPNGLLVFSHFADHLGNVEiDLTESKVG 
VHI NIT0TKMS03 Dl SSGSGLNDGQWHEVR FLAKENFA1 LTIDG 
DEASAVRTNSPLQVKTGEKYFKGGFLNOWKNSSHSVLOPSFQGC 
MQL1 0VDDOL.VNLYEVA0RKPGSFAKrVS:DMCAl 1DRCVPNHCE 
HGGKCSQTWDSFKCTCDETGYSGATOINS i Y EPSCEAY KHLGQT 
SNYYWIDPPGSGPLGPLKVYCNMTEDKVWrJVSHDLOMQTPWG 
YNPEKYSVTOL.VY5ASMDQ1SA1TDSAEYCE0YVSYFCKMSRLL 
NTPDGS P YTWWVGKANEKH Y YWGGSG PG 2 Q K CACG 3 E RJi CTDPK 
YYCNCDADYKOWRKDAGFLSYXDHLPVSOVWGDTDROGSEAKb 
SVGPLRCOGDRNYWNAASFPNPSSYLHFSTFOGETSADI SFYFK 
TLTPWGVFLENMGKEDFIKLELKSATEVSFSFDVGNGPVEIWR 
SPTPLNDDOWHRVTAERhA/KOASLOVDRLPOOIRKAPTEGHTRb 
E hY SOLFVGGAGGOQGFLGC j rs LR*«*NG VTLDLEERA KVTSGF I 

sgcsghctsygtncenggkcleryhgyscdcsntaydgtfcnkd 
vgaffeegmwlrynfoapatkardsssrvdnapdoohshpdlao 
ee i r fs fs ttkapci lly1 ss fttdfliavlvkptgsloi rynw3 
gtrepynidvdhrnmangophsvnitrhektiflkldhypsvsy 
h lps ss dtlfns pks lfiigkv j etgk i dqe i h kyntpg ftgcls 
r vqfnq1 a plkaalrqtnas ahvh 1 qgelvesncgas pltlspm 

SS ATDPWHLDHLDSASADFPYNPGQGQAIPjNGVNRNSAI I GGVI 
A\WIFTPSLCTP\VLP*SR*HVSPHKGTLPIPNEAKGAGSRQK 
KPGRRPSM2WDPPTSQRP1DESKKEWPHLRGGYLAMG 


6791 


1802 


1193 


«T/-*tj r?f^ +L t/f» TM/"/**rM//*-r\T fntir'trcrrMiftii/riJvrvD'^TDDP' 
IGHSGAKGEKGDKGUW^PKGEKViyH^PKtafcKv? i Fv>I fc*V?W 

SAW* SWLTAASTKV0AILLP0PLE*LGL0IAFMA5LA?HFSNQ 

NSGI 1 FSSVETN I GN F FDVM'IGR FGAPVSGV Y F FTFSMMKHEDV 

EEVYVYl^HNGimFSMYSYEMKGKSDTSST^VbKLAXGDEVW 

LR MGNGALHGDHQRFSTFAG FLLFETK 


6792 


33 


1073 


VR H TNVJG VDM YLFS LGSES PKGAIGHIVSTEKTI LA VE RNKVLIj 
PPLWraTFSWGFDDFSCCLGSYGSDXVLMTFENLAAWGRCLCAV 
CPSPTTIVTSGTSTWCVWELSMTKGRPRC-LRLROAJLYGHTOAV 
TCLAASVTFSLLVSGSODCTCILWDl.DHLTh'VTRLPAHREGISA 
IT1 SDVSGTIVSCAGAHLSLWNVNGQPLASJ TTAWGPEGA ITCC 
CLMEGPAWD7SQI 1 1TGSQDGMVRVWKT/VGCEDVCSWTASRRG 
APGSASKPXRPQVGEEPGLESRAGR ♦ HCFDREAQQNQP\PVTAI> 
AVS RNHTKLLVGDEKGRI FCWSADG * EERGSRGSGTTVPG 


6793 


23 4 V 


EOS 


GRKF.ANY\YGSLTCAGTVSlX?LDAEGOEVFVPFSAVl,PKVAFKD 
1>VFIX5TO3SS1jN1AEAMRR.^KVLDWGLOEOLKPHMEALRPRFSV 
Y 3 P E F I AANQS AKADNLI ?GS RAQQLEQ3 KRDI R DFR S S AGL DK 
V 1 VLWTANTERFCEV 1 PGLNDTAEN1»LRT1 ELG1.EVS ? STLFA V 
AS 1 LEG CAFLNGS PGWTL V FGALELAWQHR V F VGGDD r " K SGQTK 
VK S VLVDPL 3 G SGLKTMS 1 VF YNHLGWNDGENLS APLC FR $ KE V 
SKSNWDDMVOSNPVLYTPGEEPDHCWlKYVPYVGDf'KRALCE 
YT£ ELM1X-GTNTLVLHNTCEDSLLAAP1 MLDLALLTE LCQRVSF 

ctdmdfefqtfhpvlsllsflfkaplvppgspvvnalfrqrsci 

EK 3 LRACVGLPPCNW;LLEHKMERPGPSLKRVGPVAATY PMKNK 

kgfvpaatngctgdakgkloeeppmptt*gpghtvsr:.flpaap 
hdptlkaptnkgrchfsppstwgswgl 


6794 


36S 


i 349 


rr>VJCKK PEAS ah*ekpgppskpgvhggreraggrgshgars CRN 
EPAP PAPAPPEDHPDEEMGFT3 Dl KSFLKPGEXTYTC* CRLFVG 

nlptditeedfkrlferygepsevfinrdrgfgfiri.ksrtlae 
1 akaeldgt1 lksrplr1 r?athgaaltvkn1_s pwske lleqa 
f i 0 fg p ve ka wwddrg ratg kgf ve faak p par ka 1 < f. r cg dg 

AFLLTTTPRPVjVEPMEOFD^EDGLPEKLMCXTOQYHKEREOPP 
KFAOPG'J'FEFEYASRWKALPEMEKQOREOWKNIREAKFKLEAE 
KL; J\RH EHQLMLMR0DLMRRQEELRRLEELRN0EL0KF KQ1QLR 
KEEEHRRREEEMIRHREOEELRRQQEGFKPiNYMENYVCHFbR 


6795


1740 


1010 


GPRRQTQVRDHELDSF^DWAA.QETDCAQNSGERb* XGV/ LENFS 
TKS KSAVK I SLDLLSNPLCEQDQDLLNMVTALDTAMKRMDAFNC 
EKV7JQ1 QXTV1 EPLKXFGSVF PSLNMAVKRREOALQDV RRLQAK 
VEKYEE^KTGFVLAKLHOAREELRPVREDFEAKNRQI l.EEMPR 
FYGSRLDYFOPS KE S LI RAQ W YYS EMHKI FGDLS HQ-JDQPGHS 
DFCR ERENEAKLSELRALSI VADD 


6796


48 


683 


GKEl^T PTI KLAWLLFGLE- P V G ALGKG WS F * * SHVALGQLGW 
I/}KAVRSSWRWELCVSA0EWS0RSA*SSPSFVGACP5LNPPET 
SVCEGRDCWQR* LPRLFSALVGQPGCWPQGAFPERCV * F GRCKW 
HLCSQVLR * ERRRCCRCLPR FA* GWRRRHCRLGLG IHPA PLGST 
SPI HPEGNSCOCRR*GWAAELRLPSSWL*GKliGC* 


6797 


3620 


213 


TEKM7PSQPTRGSSCTRPSSKLWTSTWRCLTCKWAGMRKSVVGV 
TLG PMAQGLLS ASG1TTEATWTRPTTHLTLI RKWLLTAS." RVDPF 
£RPPPPP£DDLTLI,ESSSSYKNli/DAOIPO/DV;SMSFrTSG*RP 
UTS RAS S I MRS RTA I PS AS * S R LTTKH7VGGS PS A WR h X PTS RS 
VS7 F VS S S TETTAS GS CLTWK S SS PAPCPSS S A PAHS FEASCCK 
TS LWG S CGG S G DGS S ACGSG K N LS MAGTS CS S P AM CS E ' £ RAP S * 
RSASRPRTWRATTSAASSWAFRRCWCGWA*SAT* PSS7TTI SSS 
?f^CGWPCPASCASAAAWLSSTWATASVAGSCWGPlM*rS;\HSPW 
CLSACSRSSMGTTCL*RSPP\SGASRAAAAWCGSSPSSTFTPSS 
ASSSTWCSASSSRSSPAPTTFSSIPAAQAQRRASCRPTSKSART 
AF PF AS S AAGAAR PAAFSAAAEGTPRRS I RO* 


6798 


3694 


1 696 


ST 3 SWESLESWLNKATNPSNRQEDWEY11GFCDQ1NKLLEG*VS 

alwgolrgsglgrgttmakegopgsprlsalecvllvpoXpoia 
vr llah k i ospoewealqalt y lgdrvs e kvktk v i elly s wtm 
alpeeak i kdayhmlkrqg3 v0sdppi pvdrtli psp ppr pknp 
vfddeekskllakllksknpddlqeankl i ksmvredear i qkv 
tkrl^tleevnn^rllsemllhysqedssdgdrelmkelfdqc 
en krrtlfklasetedndnslgd i lqasdnls rv 1nsykti i eg 
07 1 ng e vatltlpds egnsocsnogtli dlaeldttks i .ss vloa 
paptfpssgipilppppqasgfprsrsssoaeatlgpestsnal 

SWLDKELLCLGLADPAPNVFFKESAGNSOWHLLQREOSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPAIAPKVEPAVPGHHGLALGNSALHKLDAL 
POLLEEAKVTSGLVKPTTSPLiPTTTPARPL^PFSTGPGSPLFO 








PLSFOS0GSPPKGPELSIASIKVPLES1KPSSALPVTAYDKNGF 
RILFHFAKE'CPPGRPDVLVWVSM1J4TAP1.PVKSIVLQAAVPKS 
MKVKLOP PSGTEI.S P FS P I QP P AA I TOWIXANPLXE KVRLR YK 
LTFAJLGEQLSTEVGEVDQFPPVEQWGKL 


6799 


3694 


16S6 


STISWESLESWLKKATNPSNROEDWEYI3GFCD02NKELEG*VS 
ALWGObRGSGLGRGTTMAKEGOPGSPRLSAbECVLLVPQXPQlA 
VRLLAKKI 0SP0EWEALQALTY LGDRVSEKVXTXVI ELLYSWTrt 
AliPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLlPSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLOEANKLIKSMVREDEARIOKV 
TKRLHTLEEVNNKVR1.LSEKLLHYSQEDSSDGDRELMKELFD0C 
ENXRRT1.FXLASETEDNDNSLGDILOASDNLSRV3NSYKT31EG 
QVINGBVATLTLPPSEGNSOCSNOGTLIPLAELDTTNSLSSVLA 
PAPTPPSSGIPILPPPPQASGPPRBRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLAPPAPNVPPKESAGNSOWHLLOREOSDLDFFS 
PRPGTAACGASDAPLLOPSAPSSSSSOAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPAlAPKVEPAVPGHHGIjALGNSAl.HHLDAJL 
DQLLEEAKVTSGLVKPTTSPLJ PTTTPARPLL.PFSTGPGSPLFQ 
PIS FQSOG SP PKG PE LSLAS 1 HVPLES 1 K PS S ALPVTAYDKNG F 
RILFHFAKECPPGKPDVLWWSMLNTAPLPVKSIVLQAAVPXS 
KKVKLQPPSGTELSI-FGPIOPPAAITOWLLANPLKEKVRLRYK 
LTFALGE0LSTEVGEVDOFPPVEQWGKL 


6800 


4 04 


1646 


RRSPSTGLSPVPCTSS PSI.SDYS2 PWSLLLSGTIAWATPGK* AG 
• PQAW* LGLAPAl AK 1 / Gl/TRGR K.QN KEKMAF.GGSGDVDDAGDC 
SGARYNDWSDDDDDSNESKS1 WYPPWARIGO'EAGTRARARARA 
RATRARRAV0KRASFNSDDTVLSPOEL0KVLCLVEMSEXPY1LE 
AAXIALGNNAA.YAFNRDI IRD1,GGLP3VAX3 1.NTRDP2 VKEXAL 
I VLMNLSVNAEN0RRLKVYMNQVCDDT1 TSRLNSSVQLAGLRLL 
7NMTVTNEYQHMLAMS I SDFFRLFSAGNEETKJbQVLXLIiLNLAE 
NPAMTRELLRAQVPSSIjGNSLFKKXENKEVILKLLVIFENINDN 
FKWEENEPTO>3QFGEGSLFFFLXEFOVCADK\'LG3E£HHDFLVK 
VKVGXFMAKLAEHMFPKSQE 


6801 




17SS 


S A EE FES QOA S VTMHD VDAES FEVLVDY C YTG RVSLS EAN VERL 
Y AAS DMLQI .E YVR EA CAS FLAK R LDLTNCTA 3 LKFAD AFG HR KL 
RSOAQSYIAONFKOLSHMGSIREETLADLTIjAOLLAVLRLDSLD 
VESEOTVC1IVAV0WLEAAPKERGPSAAEVFXCVRWMHFTEEDQD 
YLEGLLTXPIVKXYCLDVIEGALOMRYGDLLYKSLVPVPNSSSS 
/R*Q0OLSCICSRKSTPETGYVCOGDGDLLWTPORSI/S\RYDPY 
SGDIYTMPSPLTSFMITXTVTSSAVCVSPDHDIYIAAOPRKDLW 
VYKPAONSWQOLADRLLCREGMDVAYI/NGYIYJLGGRDPITGVK 
LKEVECYSVORNQWALVAPVPHSFYSFELIWONYLYAVNSKRM 
LCYDPSF'NMWLNCASLKRSDFOEACVKNDEI YCICD1 PVMKVYN 
P ARGEWRR I SN I PLDS ETHNYQ 1 VNHDQXkLL ITSTT PQWKKNR 
VTVYEYDTREDQW INI GTMLGLLQFDSGF J CLCARVYPSCLEPG 
P<5 FT TF"F DD AR C <?TFWT1T iDfiF*? F I J5SESGS SSSFSDDEVWVO 

VAPORNAQDQGGSL 


6802 


3.5*/ 


1341 


ETF?LPFFI»LSXTPGXTASMAKFVQGTSRM1 AAESSTEHKECAE 
PS TR KNLMNSLEOKI R CLEKQR XELLEVNQQWDQOFRS MKELYE 
R KVAEX.KTKLDAAER FLS TR E KDPHQRQU KDDRQRED DRQRDLT 
RDRLOREEXEXERLNEELHELKEENKLLKGKNTIjANKEKEHYEC 

El KRLUKALQDAIjNI xcsfsedclrksrvefcheekrtemevlx 

QQV0IYEEDFKXERSDRERLNQEXEELC01NETS0SQLNRLNSQ 
IKACGMEKEKL5KCLK0MYCPPCNCGLVFHLOPPWVPTGPGAVQ 
X0REHPPDY0WYALD0LPPDV0HKAN/DWCLAPPPVCCQAG/PR 
TPGLK^^SCLWLPXC^NFRFILSKESPSVEVHTNRERQQATRER 
G 


6803




2203 


KLSGRPYRHhK5VU=TSXl>YDlRKTIFTFTP0FIDCXX?FYLALDN j 








K/«1IVEMLKTDLSYLCSRWKMTG0?TITFP1SHSMLDEDGTSLNS 
S 1 LAALHKMODGYFGGARVOTGKLSEFLTTS CC1KLS FMDPGPE 
GKLYSEDYDDNYDYLESGNVJMNDYDSTSKARCGDEVARyLDKLL 

>rVHKYLPTKLFOASRPSFNl,LDS?HPROEN0VFf;VRVEIHLFRD 
OSGEVDFKALVLOLKETSSLOEQADILYMLYT.YKGPDWNTELYN 
FRSATVRELLTELYGKVGE 3 RHWGL3 RY3SG3 LRKKVEALDEAC 
7DLLSHQKHLTVGLPPEPREKT3 SAPLPYEALTCL3DEASEGDM 
S j S 1 LTOE I MVYLAhJ YMRTO^GLFAEM FR LR I GLI 1 QVMATELA 
HSLRCSAEFATEGIMtt>SPSAMKNLLHK3 LSGKE FGVERX/ SVR 
F TD S NVS P A 1 S I H E 1 G A VG AT KT E R TG 3 MO L KS E 3 KQV E FRR LS 

1 SAibUb 1'\j> . iyMI Fl>btji* r roAI L^yooftLoKVv-y wyKrCKKJulXj 

A1JJRVPVC-FY0KWKVL0KCHGLSVEGFVLPSSTTREMTPGE1K 
FSVHVES\V3 J NVLI J RPEYRCIiLVEAILVLTMLAI)3EIHSIGSll 
A VE K3 VH 3 ANDLFLQEQKTLGP \ DDTMLAKD PASG \ I CTLR \ Y D 
EAPSGRFGTMTYLS\RAA\ATYV0EFLP\RS3CAMQ 


6804 


3 


9S1 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLEEKRKSLRTTGFYSGFSEVAEKRIKLLNNSDERLQNSRAKDR 
KDVWKS 1 CGOWPKKTLKELFfJDSDTEAAAS P PH PA PEEGVAEES 
LQTVAEEESCSPSVELEKPPPVNVDSKPir.r.KTVXVNDRKAEFP 
SSGSNFSA*3PLPYLHLNRLHCSL*0KGSR00S£VTVSEPLiAPN 
OEEVRS3KSETDST3EVDSV/iGELQDLQSERE* LASRF* CQCfc'b 
KC* * SARTRTS* KSLYRSEKSERCSGNRK F3 KKARKKP* SNSGK 
QOXEGKRHK 






206 


R0PDLKYFGKSFDVSVSESSSLLSNDLPKFADG3KARWRN0NYL 
VPSPVLR1LDHTAFSTEXSAD3V1CDEECDSPESVNQQTQEESP 
3 E VHTAEDV P I AVEVHAI SEDYB3 ETENNSSESLQDQTDEEPPA 
KLCKILDKSOA^NVTAQOKWPLLRANSSGLYKCEbCEFNSKYFS 
DLK0KMILKKKRTC£IfVCRVCKESFSTNMLL3ErlAKLHEEDFYl 
CKYCDYKTVIFENLSOHIADTHFSDHLYK'CEOCDVOFSSSSELY 
LHFQEHSCDEOYLC0FCEHETNDFEDLHSKVVNEHACKL3ELSD 
K YNNG EHC-0 Y SLLS K 3 TFDKCKN FFVCOVCG FRS RLHT1WNRHV 
A3EHTK3FPKVCDDCGKGFSSKLE\lAKHLNSHi J S'EGIYI,COYW 
EYSTGQ1EDLKIHLDFKHSADLPHKCSDCLV.RFGNERELISKLP 
VT4ETT 


6806 


212 


3794 


VAXCFPNSDPVMFMDAFYGCLLAELGPVP3 EVPLTRKDAGSQQV 
GFLLGSCGVFLALTTDACQKGLPKAOTGEVAAFKGWPPLSWLVI 
DGKHloAKPPKDWHPLAQDTGTGTAY 3 2YKTSXEGSTVGVTVSHA 
S LUXQCRALTOACGYS EAETLTNVLDFKR DAGLWHGVLTS VK>JR 
MHWSVPYAIjMKANPLSW 3 QKVCFY KARAALVK S R DMHWSLUVQ 
RGORDVSLSSLRML3 VAIX3ANPWS 3 SS CDAFIJJVFOSRGLRPEV 
1 CPCASSPEAiTVA3RRPPDLGGPPPRKAVLSIWGLSYGV3 RVD 
TEEKl^VT.TVQDVGOVMPGANVCVVKLEGTPYLCKTDEVGE 3 CV 
S SS ATGTAYYGLIiG3TKNVFEAVPVTTGGAP I FDR FFTRTGLLG 
F 3GPDHLVF3 VGKLDGLMVTGVRRHNADDWATALAVEPMXFVY 
RGRIAVFSVTVLHDDRIVLVAECRPDASEEDSF0WMSRV1>0AID 
S I HOVGVYCIaALVPANTLPKAPLGG 3 H I SETKQR FLEGTLHPCN 
V LMCPHTCVTNLPKPROKOPEVGPASM 1 VGNLVAGKR I AQASGR 
EI^LEDSDOARKFLFLACVLQWRAliTTPDHPLFLLLNAKGTVT 
S TATC VQlii KRAE RVAAALME KGF L S VG DHV ALV Y P PGVDL I AA 
FYGCLYCGCVPVTVRPPHPONI^TTLPTVXMIVEVSKSACVLTT 
OAVTRIJLRSK EAAAAVD3RTWPT 3 LDTDDI PKKK 3 AS VFRPPSV 
DVlAYLDFSVSTTGILAGVKflSHAATSALCRS 3 KLQCELYPSRQ 

3aicldpycg1x5falwclcsvysghqsvlvppleleshvslwls 
avsoykarvtfccysvmemctkgix;aotg^rmkgvnlscvrtc 
kwaeerp\r i altqsfs k l fxdlglparavstt fgcrvnva3 c 








LOGTAGPDPTfVyVDMRAljRHDRVRLVKKGSPlLSLPLMESGKIL 
PGVKVI1AHTETKGPLGDSHLGKIWVSSPHNATGYYTVYGSEAL 
HADHFSARLSFGDTQT1WARTGYLGFLRRTELTDASGGRHDALY 
WGSLDETLELRGMRYHPI VI ETSVIRAHRS 1 AECAV FT WTJNLL 
VWVELDGL-EQDALDLVALVTN WLEE KY LWG WV J VDPG V I P 
INSRGEKQKMHLREGFLADQLDpI YVAYNM 


6807


1444 


60C 


VGKDTVKAf»lFTCFPKCLGFSPP\WTV£PRSEESHTTTVSGGNG 
5VFOAGF0LOALANLEAKRGE 1 GAALSSRDVSGLPVYAQSGEPR 

R ltqaqv aa f pgen alehs sdqdt wds LRS PGFCS pls SGGGAE 

SLPPGGPGHAEAGHLGKVCDFliLNHQCPSPTSVLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPOPGPHGSLLTEGCLRSLSGDLN 
RFPCGME\^SGQRELESWAVGEAMA\lKFPMGAMSYCLRDRSR 
FLFRLPMG LS CPLCVQ 


6808 


206? 




GVGSGAASALARSRPLASRLSERRRTRAPRSGAMQRliAMDLRKL 
SRELSLYLEHQVRVGFFGSGVGLSLILGFSVAYAFYYLSSIAKK 
P01.VTGGESFSRFLODHCPW7ETYYPTVWCWEGRGQTLLRPF\ 
3 TS KPPVOY RNELI KTADGGQ 1 SLDWFDNDNS TCYttDASTRPTI 
LLLPGLTGTSKESY1 LHMTHLSEELGYRCWFNNRGVAGEN1.lt 
PR T Y CCANT F DLET V 1 1 U I VI IS I . Y P *? AP FI AAG VSMGGM I , LLNYL. 
GKIGSKTPLMAAATFSVGWNTFACSESLEKPLNWLLFNYYLTTC 
LOSSWKHRKMFVKCVDMDI-T^'jKAKSIREFDKRFTSVMFGYQTI 
DDYYTDASPS PRLKSVG 3 PVLCLNS VDDVFS PS11AI P J ETAKQN 
PNVALVLTS Y GGH J GFLEG I W FRQS TYMDR V FKQFVQAM VEHGH 
ELS 


6809 


939 


6E 


DYSGQTPVPTEHGMTLYTPAOTHPEQPGSEASTOPIAGTOTVPQ 
TDE7\J\QTDSQ P LH PS DP TE KQQ PKR LHVSN I P FR FR DPDLRQMF 
GQFGKILDVEI 1 rNERGSKGFGFVTFETSSDADRAREKLNGTIV 
EGR K I EVNKATAR VMTNKKTGNP YTNGWKLNP WGAVYGPE FYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYTJTFRAAPPPPP 1 PTYG 
AVVY ODGF YG AE I \ 1 .EATOPTLn-l.S PLORROPTATVTAE STQLP 
TRTi TPSGPRR PTALEPCETFHRFLLGP 


6810


93$ 


6b 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPG; 
TDEAAQTDSOPLHPSDPTEKOOFKRIjHVSNIPFRFRDFDLROMF 
GQFGKILDVEI I FNERGSKGFGFVTFETSSDADRAREKLNGTIV 
EGRKIEVNNATARVMTNKKTGNPYTNGWKLNPWGAVYGPEFYA 
XrrGFPYPT^rGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG 
AVVYODGFYGAEI \LEAT0PTDTLS PLCKROPTATVTAESTQLP 
TRT I TPSGPRR PTALEPCETFHR FLLGF 


6811 


1522 


65t 


DLVTVWSFVDCRV1ASTHGH\KSWVSWAFDPYTTSVEEGDPME 
FSGSDEDFODLLHFGRDRADSTOCRLSRRNSTDSRPVSVTyRFG 
SVGQDTOLCLWDLTEDILFPHQPLSRARTHTNVMNATSPPAGSN 
GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMDGA1ASGV 
SKFATLSLHDF KERHHEKDHKRNHSMGK I SSKSSDKLNLVTKTK 
TDPAXTLGTPLCPRMEDVPLLEPL1CKKIAHERLTVLIFLEDCI 
VTAC0EGF1 CTWGRPGKWS FNP 


6812 


4001 


168: 


EDAVFSLDLSTI IOGTWFLNGEELKSNEPEGgVEPGALRYRI EQ 
KGLOHRLILIIAVKHODSGALVGFSCPGVODSAALTIOESPVHIL 
SF^DKVSLTFTTSERVVLTCELSRVDFPATWYKDGOKVEESELL 
WKMDGRKH^LILPEAKVODSGEFECRTEGVSAFFGVTVQPPPV 
HI VDPREHVFVHAI TSECVM1»ACEV\DR \EDAPVRWYKDGQEVE 
ESDFVVLENEGPHKRLVLPATOFSDGGEFQCVAGDECAYFTVTI 
TDVSSWIVYPSGKATYVAAVRLERVArLTCEW^PWAEVRWTKDGE 
EWES PALLLO KEDT VRRLVLPAVQLEDSGEYLCE I DDESAS FT 
VTVTEPPVRI 3 YPRDEVTLI AVTLECWLMCELSREDAPVRWYK 
DGLEVEESEALVLERDGPRCRLVIjPAAQPBDGGEFVCDAGDDSA 
FFTVnVTEPPVOFLALETTPSPLCVAPGEPVVLSCELSRAGAPV 






VWSHNGRPVOEGiGLELHA5£GPRF.VLC2QAAGF7*KAGLYTCOSG 
AAPG APS 1& F T VQV AE P P VR W A F E AAQTR VR S T PGGDLE LWH 
LSGPGGPVRMYKDGERLASOGRVCLEOAGAROVLRVQGARSGDA 
GEYLCDAPOD^RIFLVSVEEPLLVKLVSDLTPLTVHEGDDATFK 
CE VS P P DAD VT W LR NG A VVTPGP CP QS CCS YGG CRM CGORKART 
CVS KWRQAEWVQRGPC AGCEVGf. PC P7TLAC P\S PRMGTSTAS SS 
MVSY WPTRAPTAARATT I AP WPG S A 


6813




0 J t 


ccTnnc Pfi\/DZ4f:DPDT.TV^VT.f^:vz.riPKDT WMurL"irai UTccfui 
id lyyKrovrrtUrKriJiAj I LAj V/wri f\ r JUrj^lnV- K-JL-/\1j V IsisL^JlL* 

LHSROGSOIDQTECV3RMNDAPTRGYGRDVGNRTSLRVIAHSS3 
QR3 LRNRHDLL1JVS0GTVF3 FWG PSS YMRRDGKG0VY1WLHLLS 

TMT3ALELCDR3NVYGMGPPDFCRDPNHPSVPYHYYEPFGPDEC 
TMYLSHERGRKGSH}!RF3TEKRVFKJ>!WARTFN1HFFQPDW:<PES 
1A1NHPENKPVF 


6814 


3 


73: 


KFRR0EAN/AKERNRMHGLNDALDNLRKVVPCYSKTQKLSK1ET 
LA KN Y 1 WALS E 3 LR 1 G KR PDLL? FVQNLCK G LSOPTTNLVAG 
CLOLNARSFLMGOGGEAAHHTRSFYSTFYPPYHSPELTTPPGHG 
TLDN S KSMK-P Y N Y CS AY t S rYE t: T S P E CAS PQ F ^G P1»SPPP1NY 

ng3fslkqeeti,dygknynygw>:ycavpprgpi.gogamfrlptd 

SUFPYDbllLRSOSLTMODEIiNAVFHN 


6815


906 




ogldpasotkwellkdgsgrrgdrrssrdmaggagprsesdle 
dvgptaewngdg 5gslrr sgsfc- k3 .rdalrrs semi a/kklqgg3 

PQEPPNPRMKP.ASSbNFLNKSVEEPTQPGG 


6816 


1 


ec: 


NLLKTHKF\LLG0DED£LHSVPVA0MGl^Y0EYLKTiJ^SPLRE3D 
PDQPKRL.HTFGNPFK0DKKGMM3 DEADEFVAG PQNKVKRPGEPN 
SPMSSKRRRSMSLLtRKPQTPPTVTNHVGGKGPPSASWFPSYPN 
LIKPTLVHTDATIIHDGHEEKMENGQ3TPDGFLSKSAPSEL3Nh5 
TGDLMP PNQVDS LS DDFTS LS KDGIi 3 OKPGSNAF VGGAKNCS LS 
VDDOKDPVASTLGA^PNTLQl TF AMA0G3NAD : KHOLMKE VR KF 
GRSK 


6817 


172 


345', 


LGMMDS PK3 GNGLP VI GPGTD3 G 1 S S LHMVGYLG KNFDSAXVPS 
DEYCPACKEKGKLKALKTYR 3 SFOES 3 FLCEDLQC3 YP1X5S KSL 
N>lL3SPDLEECHTPHKPOKRKSLFSSYKI^U J L l ANSKKTRNY3A 
3 DGGKVLNS KHNGEVYDETS SNLF C S SGQQNP 3 RTADSLERNE 3 
LEADT VDMATTKDPATVDVS GTGR P S PQNEGCTS KLEMPLES KC 
TSFPOALCVOWKNAYALCWLDClLSALVHSEELKNTVTGljCSKE 
ES I FWR LLTK YNQANTLLYTSOL FG VKDGIXK K LTS KI FA E 3 ET 
CLNEVRDE3 F3 SLQPOLRCTLGDKES PVFAFPLLLKXETH 3 EKL 
FLYSFSWDFECS0CGH0Y0NRJJMKSLVTFTNV3PEWHPLNAAHF 
GPCNNCNSKSQ3 RKMVLEKVSP1 FMLHFVEGLPONDLQHYAFHF 
EGCLY03TSV3QYR.ANNHF3TW3LDAIX5SWLECDDLKGPCSERH 
KKFEVPASE1 H 3 V3 WERKI SQVTDKEAACLPLK KTNDQHALSNE 
KPVSLTSCSVGDAASAETASVTHPKD1SVAPRTLS0DTAVTHGD 
HLLSGPKGLVDN33>PLTLEETIOKTASVSQLNSEAFL\LENKPV 
AENTG3LKTNTLLS0ESIMASSVSAPCNEKLI0D0FVDISFPS0 
VVNT^QSVOLNTEDTWTKSVNKTDATGLIOGVKSVEIEKDAQ 
LKQl^TPKTEOLKPERVTSQVSNIKKKETTADSOTTTSKSLONO 
S LKENQKK P FVG S WV KGL I S RG AS FM PLCVSAKK RNTI TDIjQ P S 
VKGVIWFGGFKTXG 3 NQKASHVS KKARKSASKF P P 3 SKPPAG PP 
S SNGTAAHPHAHAAS EVLEKSGSTS CG AQLNHSS YGNGI SS ANH 
EDLVEGQIHKLRLKLRKKLKAEKKKLAALMSSPOSRTVRSENLE 
0VPQDGSPNDCES3 EDLLNELP YPI DIANESAC7TVPGVSLYSS 
QTHEE 3 LAEL1>S PTPVSTELS ENGE GDFRYUGMGDSHI PPPVPS 
EFNDVSQNTHLRODHNYCS PTKKN F CE VQPDS LTNNACVRTLNL 
ESPKKTDI FDEPFSSS ALNA1ANDT LDLPHFDEY LFEtfY 


6818 


2 


24 L 


RGFDKVLWT/LSGAVK\CVQFSRI S PDGEEGYPGELKVWVTYTL 






DGGE/U{£/ATTEHKP/VQATPVNLT\TJLTSrW0ARLPOl , 


6819


1 


963 


G 1 PCTE KG N F DN ANVTG E 1 £ FA I HYCFXTHSLEI CI KACKNLAY 
GEEKKKKCWPYVKTYLI'PDRSSOGKRKTGVO/WrVDPTrOETLK 
YQVAPA0LVTRQLQVSVWHU5TUARRVFLGEVIIPLATWDFEDS 
TTQSFPWKPLSAKADKYEDSVPOSNGELTVRAKLVLPSRPRKLQ 
EAOEGTDOPSLHGOLCbWLGAKNLPVRPDGTLNSFVKGCLTLP 
DQOKLRLKSPVLRKOACPOWKItSFVFSGVTPAOUROSSLELTW? 
DQALFGMNDRLLGGT\RLGSKGDTAVGGDACSOSKLOWOKVLSS 
PNLWTDMTLVLH 


6820


1014 


340 


GDMVY1VGHVPPGFFEKT0NKAWFREGFNEKYLKWRKHHRV1A 
GQFPGHKHTDS FRMLYDDAGVP J SAMF1 TPGVTPWKTTLPGWN 
GANNPAI RVFEyDRATLSLKDMVTYFMNLSQANAGGTPR WELEY 
OLTEAYGVPDASAHSMHTVbDRlAGDOSTLQRYYVYnSVSYSAG \ 
VCD EA CS MQ H VCAM RQVD I DAYTTCL Y ASGTTPV PQ L ? LI »L»MAL ; 
LGLCT 


6821 


10 88 ■ 51*- 
1 


efdiyr/evggefvpvtrddssn(;fprtohgpsptvhpiospon 
rfcvltldpetlpa3attlidvlfyshstpkeaassspepssit 
FFAFSL3EGYJ vsivmdaetokkfpsdllltsssgelwrkvrig 

GOPbGFDECG 3 VAC 1 AG PLAAAD1 S AY^' 1 STFNFDHAIiV PEDG3 
CSVIEVLORR0ECLAS 


6822


lose 


Sit 


EFDJ YR/EVGGEFVPVTRDDSSNGFPR1 0HGPSPTVHP2 OS PON 
RFCVLTl.DVETbPAlATTLlDVTjFYSHSTPKEAASSSPEPSSlT 
FFAl'SLIEGYI \ SI VMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPbG FDECG 1 VAQJAG PLAAAD1 S AYY ISTFNKEHALV PEDGI 
GS V I EVl^RKOEGLAS 


6823


654 


221 


PPKLLSRWAKMGKGDE3 V\LSDLNFPGI>LHLPVVGPWRSVOTAC 
GlPOLLRAVLKLLPLDTYVESPAAVMELVPSDKERGLOTrV"WTE 
YES1 LRRAGCVRAUAK1 ERFEFYEKAKKAFAVVATGETAbYGtf L 
ILRKGVLALNPLb 


6824 


858 


1(H 


LLLAURWG WG \ CCFFSLAVSVKMNVLLFAPGLLFLLLTOFGFRG i 
Al.PKLGl CAGLOWbGLPFLbENPSGYLSRSFDLGROFLEHWTV i 
WWRFLPEALFEHR/lF^LALLTAHLTIjLLLFALCRWIIRTGESILS 1 
U.RDPSKRKVPP0PLTPN0IVSTLFTSNFIGICFSRSLHY0FYV | 
WYFHTLPYll,VIAJ«1PARWLTHLLRl J bVLGLlELSWMTYPS*:SCSS i 
AALH 3 CHAVI LLOLWLGPOPFPKSTQHS KXAH 


6825

1 


3 


1173 


SSGEFGLOASD 1 MWTI S DTGWIL1 1 LCS LMEP WAbGACTF VHLL 
PKFDPLVTLKTLSSYP1 KSMMGAP1VYRMLLQODLSSYKF PHLO 
NCLAGGES LLPETLENWRAQTGLD1 RE F YGQTETGLTCMVS KTM 
KJ KPGYMG TAAS CYDVQ 1 1 DDKGNVL P PG TEGD J G I R VK P I R PI 
GI FSG YVDNFDKTAAN1 RGDFWLLGDRG IKDEDGY FQFMGRADD 
IINSSGYR 1 GPSEVENALMEHPAWETAVISSPDPVRGEWKAF 
VI LAL0 FLSHDPEQLTKELQQHVKSVTAPYKYPR K I EFVLNLPK 
TVTGKI ORA\ KLRDKEWKMSGKAPCAVRHLRDl KbDSPLLSbSF 
PFGPLAIjPMDGYGDSLWEEHEYKFCLALVISTKLYHVRC 


6826 


2304 


954 


LKTESFKPW/VNI ALAFKLLGERASPNSPWQPyiOTLPREYDT? 
LYFEEDEVRYL0STQAIHDVFSOYKNTARQYAYFYKVICTHPHA 
NKLPLKDS FTYEDYRWAVSSVMTRQN01 PTEDGSRVTLALI PLW 
DMCJNrKTNGLITTGYNLEDDRCECVALODFRAGEQl YI FYGTRSN 
AE FVI HSG FFFDNN SHDR VK I KLG^S KS DRLYAMKAEVLAJ^AG I 
PTSSVFALHFT EP P I SACbLAFLRV FCMTEEELKEHLLGC S A I D 
RI FTLGNS EFFVSWDNEVKLWTFLEDRASLLLKTYKTTI E EDKS 
VLKNHDLS VRA KMA I KLRLGE KEI LEKAVXSAAVNREY YR QQME 
EKAPLPKYEESNLGLLESSVGDSRLPLVLRNLEEEAGVODALNI 
RSA I S KAKATENG L VNGENS I PNGTRS ENESLNQE S KRA V EDAK 
GSSSDSTAGVKE 
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ID 
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6827 


1 




SSWEFGLSVUiGLFLLFVLENMLGLLRKRGLRPRCCRRKRRNb 
ETR^DPENGSGMAJLOPLOAAPEPGAOGOREKNSCHPPALAPPG 
HQGHSHGHQGGTD1 ' W f<V LLG DG LHN LTDG LA I G AA FS DG FS S G 1 
LSTTLAVFCKF.LPH E LGDFAMLLQSGLS FRR LLLLSLV SGALGL 
GGAVLGVGLSLGPVFLTPWVFGVTAGVFLYVALVLMLPALFPSS 
GAPAYA\HVLLQGLGliLLGGCLMLAJTLLEERLLPVT7EG 


6828




I6b4 


KSOHG/WILObMHSCKEGYVKDLKGNPGLHRAMLDLDNGTRFSE 
LGHLSCTASLKRGSSFOSGRDDTWRYKTPHRVAFVEKXTKLVLS 
QLPNFWKLWI £ YVNGS LFSETAEKSGQI ERSKNVRQROKDFKXM 
IOEVMHSLVKLTRGAi.I>PL£IRDGEAKOYGGVJEVKCELSGOWLA ! 
HAJQTVRLTHESLTALElPNDLLOTIQnLlLDLRVRCVKATLOH 
TAEH 1 KRLAEKEDW1 VDNEGLTSLPCQFEOCI VCSLQSL-CGVLE 
CKPGEASVF00PKT0EEVCOLSINIMOVF3YCLE0LSTXPDADI 
DTTKLS VDVS S PDLFG E I HEDFSLTSEQRLL I VLSNC CY LERHT 
FLNIAEHFEKHN t FOG: F.KnQVSMASLKELDORLFENYl ELKAD 
PI VGSLEPG1 YAGYFDWKDCLPPTGVRNYLKEALVN3 ] AVHAEV 
FTJ SKELVPKVLSKV: EAVSEELSRLMOCVSSFSKNGALQARLK 
3 CALRDTVAVY LTPF.f KSSFKQALEA.LPOLS SGADKKLLEEION 
KFKSSMHLQL'ICFOA^.SSTMMKT 


6829






KRKEAGEAA PP AGAGG RAAGGWGKW VRLNVGGTV FLTT kQTLCR 
EQK S FLS RLCOGEE LOS DRDETGAY L. I DRDPTYFGP I LN FLRHG 
KLVIDKDMAEEGVLELAEFYH JGPL1 Rl 1 KDRMF.EKDYTVTQV P 
PKlIVYRVLOCCEEELTOf4VSWSDGWRFE0LVNIGSSYKYGSED 
OAEFLCVVSKELHSTFNGLSSESSRKTKSTEEOLEEQCCQEEEV 
EE VEVEOVQVE ADAGE K / CCYKPEAPGCEAPDHLOGLG V p I 


6830 




9? S 


KEPGS V ICNLS 1 VYRSPD FLWN KHWDVK 1 DS KAWRETLT LQKQL 
R YR F PELADPDTCYG FK FCHQLDFSTSGALCVALNKAAAGSAYR 
CFKERRVTKAYLALLRGKIQESRV7ISHA1GRNSTEGRAIITMCI 
EG S QG CEN P K P S LTDL W LEI 1G L Y AG DP VS K VLLK PLTG R THQL 
RV\HCSAlvGHPVVGDLTYGEVSGREDRPFFMMLHAFYLR I PTDT 
ECVEVCTPDPFLPSLDACWSPHTLIjOSLDOLVOALRATPDPDPE 
DRGPRPGSPSALLPGPGRPPPPPTKPPETEAORGPCLOWbSEWT 
LEPDS 


6831 




2 08'/ 


S LFFGS STPDN K VAE O/EDJLE TQPS PS VE XAVT V I D PEGT 1 PTNF | 
NVAEXPADHSLSSVKLKTADEPRGTLVKSGDGONVXEK^MILSN | 
VEDL00PKF1SEVSRLDYGKKEISGDSEEMKINSWTSADGENL j 
EI OS Y SL3 GEKLVMEEAKTI VPPHVTDSKRVOKPAI AP PS KWNI ' 
S 1 FKEEPRSDOKOKSLLS FDWDKVPQQPKSAS SNFAS KNITKE j 
SEKPESI ILPVEESKGSLIDFSEDRLKKEMONPTSLKI SEEETK \ 
LRSVSPTEKKDKLENR \ S YTL\AEKKVLAEKONSV\APLELRDS 1 
NE1GKTOITLGSRSTELKESKADAMPQHFY0NEDYNERPKIIVG 
SEKEKDEKKKK 


6832 


1809 




MGSGLISGPPCDNSGEAXKEPERAQEHSLPNFAGGQHFFEYliLV 
VSLKKKRSEDDYEPI17Y0FPKRENLLRG00EEEERLLKA1PLF 
CFPDGNS WASLTEYPR ETFS FVLTNVDGSRK 1 GY CR2 LLPAGPG 
PRLPKVYCI 1 SCIGCFGLFSKIXDEVEXRHQI SMAVI Y PFMQGL 
REAAFPAPGKTVTLICSFI PDSG7KPI SLTRPLDSHLEHVDFSSL 
LHCLSFEOIL0IFASAVLERKIIFLAEGLSTLSOCIHAAAALLY 
F FSWAHTY I P W PESLLATVCCPTPFMVGV0MRF00EVKDS PM2 
EVLLWLCEGTFLMSVGDEKDILPPKl^ODDILDSLGOGlKElrKT 
AEOINEHVSGPrVQFFVKIVGHyASYIKREANGOGHFOERSFCK 
ALTSKTKRFFVKKFVKTOLFSLFIOEAEK£KNPPAGYFOOKILE 
YEEQKKQ/ TETXGKNCE 2 RAWNKND 


6833 


i 


1129 


PLMTLS0CGG1 PGHGHSKGGHGHGHGLPKGPRVXSTRPGSSDIN 
VAPGEC^PDOEFTNTLVANTSNSNGLKIJ>PADPENPRSGDTVEV 
OVNGNLVREPDHMELEEDRAGOLNKRGVFLrJVLGDALGSVIVVV 
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SKQ 






! NALVFYFSWKGCSEGDFCVNPCFPDPCKAFVEI 1NSTHASVYEA 
j GPCWVLYLDPTLCWMVClLLYTTyPLLKESALILLOTVPKOID 
! IRNLIKELRNVEGVEEVHELHVV.'OLiAGSK] IATAH1KCEDPTSY 
| MEVAKT1 KDVFHNHGIHATT30PEFASVGSKSSWPCELACRTQ 
. 1 CALKQCCGTUPOAPSGKDAEKTP.hVSISCLEUSNKLEKKPKRTK 

: aenipa\wieikn\ipnk\qpfssl 


6834 


lb 


1151 I AGOERPAP3WRLLKLPTPSVSR>LAEPA>-:iF3NR*GA*E*RGGLP 
j LCGSSASAYGWH* RLTPWSPGG5? * HM * SS K A P VTCAREVLVAG P 
I CSKLVLSGARGIVGTTVOVLVEAOOPLLLLF'T'GVWGLNLRAGEB 
| SRAL* LIEEVT0VRDAHLGNAWGCAOCLS0GOVGSALAKAULE 
[ AAAAVRDCKEVLTVSGDKQOAEVSVSL* VRDVO/EEAGCVEFGQ 
j A}IGRPGLALAKGRGGTNEVEEQVQ\QGVQKLVLSAHECHELVAG 
j QQDGEDQAARTRLL0AGAHSVAHGRR0G0APCRPKOEAGVSCHE 
| LQOVVGD AL * ARE * APQ 1 I VLLLLF.DVA0LR7GKKA* DbWDVE 


6835 


j 


834 


CI PAADR\EASLELIKLDISRTFPNLCI FQOGGPYHDMLHSILG 
A YTCYR PDVG YVQGMS FI AAVL J LNLDTADA F I A FSNLLNKPCQ 
MAFFRVDKGLMLTYFAAFEVFFEENLPKLFAHFKKNNLTPDIYI/ 
3 DW J FTL YSKSLPL-DIiACR J WDV FCRDGEEFLFR TALGI LXLFE 
DJLTKMDF1HMAQFLTRI.PEDLPAEELFAS3ATIWQSRNKKWA 
Q VLT ALQKDS REITOEG KS VP PT L K LOR £ F Al<GTNCS PMP R PLCC 
FRLTPGOPRRTDAL 


6836 


} 


8b0 


MSCGRPPPDVDGMITLKV\DNLTYRTSP^SLRRVFEKYGRVGDV 
Y I PRE P.HTKAPRGFAF VR FHDRRDAODAF. AAMDG AELDGRELRV 
QVARYGRRDLPRSRQGRRHAAGPEAA/RYCRRSRSYGRRSRSPR 
RRJiRSRCRGPSCSRSRSRSRYRGSRYSRSFYSRSPYSRSRYSRS 
PYSR5RYRESRYGGSHYSSSGYSKSRYSRYKSSRSHSKSGSSTS 
SRSASTSKSSSARRSKSSSVSRSP.SRSRSSSMTRSPPRVSKRKS 
KSRSRSKRPPKSPEEEGOMSS 


6837


_ 




tdgaavagnpgsdyfpggtap7ggfrtrrp\sgtsssgska5gp 
pnppaogdgtslspnytlestsgkijgkpvsggggrgrgrrkrds 

GHVS PGT FFDKYSAAPDSGGAPGV S PGQQOASGAAVGGS SAGET 

rgaptphekaltspswgkgaelllodqpdblgsldggaksdsss 
pnvgefasdevstsyanedevssssdnpoalvkasrsplvtgsp 
klpprgvgagehgpkapppaujlgimsnststpdsygggggpgh 
pgtpgle0vrtptsssgapppde3 v.phel lcaoiolqroqfsis 

EDOPLGLKGGKKGECAVGASGAONGDSEIjGSCCSEAVKSAMSTI 
DLDS LMAEKS AAWYMPADKAL VDS ADDDKTLA P WE KAKPQNPNS 
KEAHDLPANKASASQPG SHLOCLS VHCTDDVGDAKARAS VPTWR 
SLKSDI SNRFGTFVAALT 


6838 


3 6 


499 


bTDT P P P KTHMI KHS 1SDY KATLF C WALG FY P ME 1 TLTWQQDEE 
DOTRDHELVETRPAGDGTFQKWAA\ , WPSGEE/0/RYMCHVQHE 
GLPEPLTLRWEQSSQPTI PI VG3 VAGLVLLGAWTGAWSAVMC 
RKKNSDRVSYSEAAS5DHAQGSDVSLTACKV 


6839 


-i 


1135 " 


AAPAGGGPDPEALSAFPGRHLSGLSWPQVKRLDALLSEPIPIHG 
RGh'FPThSVQPRQI RAGGPQHPGGAG \J HVHR VRLHGSAASHVL 
HPESGU3YKDLDLVFRMDLRSEASF0t,TKAWLACLl,DFLPAGV 
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSWKSGKNVELK 
FVDSVRROFEFSIDSFOHliDSLLLFGOCSSTPMSEAFHPTVTG 
ESLYGDFTEALEHLRHRVI ATRSPKE 1 RGGGLLKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFFIDFPDLVEORRTLERYLEAHFGGAD 
AARR YACLVTJbHRVVNESTVCLMNKERRQTLDL I AALALQALAE 
CGPAATAALAWRPPGTDG WPATVHY Y VTPVO PLLAHAYPTWLP 
CN 


6840


42S4 


2061 


ELCGDFSV PDV PKSKAWCENS I CVG FKRDY Yh I RVDGKGS I KEX> 
FPTGKObEPLVAPLADGXVAVGQDDLTVVl.NEEGI CTOKCALNW 
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NO: 



6841 






2206 






"f? i>VAMEHQPPY 3 lAVLPK^ V E I RTF FFRLLVQS 3 ELQR PRFJ 
T5. GGSN 3 3 YVASNHFVWKL3 PV PMATQ3 QQLl^DKQFELALOLA 
EMK&2SES EK00C3 HH I KNLV AFNLFCQXR FDESKOVFAKLGTD 
PT r. VMGLY PDLLPTDYRKOLOV PNPLFVLSGAELEKAHLALa DY 
LT0KaS01-VXKLNDSDH0SST5PLMEGTFTIKSKKKLL0HDTT 
LI K C Y LKTNVALiV APL1 #R LENT* H CH 3 EESEHVLKKAHKYSEL3 3 
LY EK KGLHE KALOVLVDQSKK/-NS?LKGHERT*VOY LQHLGTENL 
KL: FSYSVWVIjRDFPEDGLX: fTEDLPEVESLPRDRVLGFLIEN 
FKGLA3 FYLEH3 IHW/EETGf RFHNC3 j 301>YCEKVOGLMKEYLL 
S F > AGXTPV PAGEEEGELGEYRQXL3-MFLE 3 SSY YDPGRL1CDF 
P? I'GLLEERALLLGKMGKHECALFIYVH 1 LKDTRMABEYCHKHY 
DRM KLGNKDVYLSLLRMYLSF PS 3 HCLGP; XLELLEPXANLQAA 
LO" V L ELHH S X LD TT KA LNLliF ANTQ 1 ND 3 R 3 FL»E KVLE ENAOK K 
R F N 0 V LKKLLHAE FLRV \QEE R 3 liHQOVKC 1 3 TEEKVCMV CKKK 
3GKSAFARYPNGVWHYFCS\KEVNPADT 



yp£r3GTKSNTPTSSVPSAAVTPLNSSLCPLGDYGVGSKNSKRA~ 
REKR DSRNMEVOVTOEMRNVf 3 GMGSSDEKSDVQD3 3DSTPELD 
MCPr.TRLDRTGSSPTOGIVNKJ'.FGlNTDSLYHEbSTAGSEVIGD 
VDIGADUX^EFSGMGKEVGNLLLENSQLLETKNALNVVKNDLIA 
KVDOLSGEOEVLRGELEAAKOAKVKL-ENRlKEbEEELKRVKSEA 
3 3A'-vKEFKEEAEDVSSYL.CTE£DKIPMACRRRFTRVEMARVLME 
RNOY KERLMEL0EAVKW7EM3 RASREHPSVQEKK.KST1WQFFSR 
LFS. 5 S SSPPPAKR P Y PSGN 3 H Y KSPTTAGFSQRRNHAMCP I SAG 
SRPLEF F PDDDCTS SARREQK? EQYRQVR E HVRNDDGRIvQACGW 
ST .P* K Y KOLS PNGGCEDTRMKWPVpvy CRPLVEXDPTMKLWCA 
AGVHLSGWRPNEDDAGNGVKFAPGRDPLTCDREGDGEPKSAHTS 
PEKK KAKELPEMDATSSRVWT LTSTLTT5K WI IDANQPGTVVD 
0F7VCKAJiVT J CISS3PAASDSDYPPGK^FLDSDVNPEDPGADGV 
LAG 3 7LVGCATR CNV PRSNCS £ R GDT P V1XKG0GE VAT3 ANGKV 
NPSCSTEEATEATEVFDPGPSEPETATLRPGPLTEHVFTDPAPT 
PS 5 G POPGSENG P E PES S STR ? E PEPSG DP7 G AGS SAAPTM WLG 
AONGV: LY VHSAVANWKKCLHS1 K1.KDSVLSLVHVKGRVLVAJLAD 
GTLA } FHRGEDGOWDLSNYHLM^LGHPKHS 3RCMAWYDRVWCG 
YKN K \r,{ VI 0 PKTMQ 3 EKS FDAK PRRESQVRCLAW I GDGVWVS I R 
LDFTLR LYHAHTHQK LQD VD I E PYVSKMLGTGKLGFS FVR I TAX 
LVj-.OS RLWVGTGNGW3 S JPLrETWLHRGONLLG\LRANKTSP 
TSGEG \ ARPGGN 3 3HVYG\DDS£DRAARSF1 PYCSMAQAQLCFH 
GHR LAV KFFVSV PGN VLATbNG S VLDS P AEG PGPAA PASEVEGQ 
KLRKVLVLSGGEGY 3 DFRIGDGEDDETEEGAGDMSOVKPVLSKA 
ERSKi:VWQVSYTPE 



6842 



926 



RCOCLSATILTDKOYLERTPLCAILKQKAPOOYRIRAKLRSYKP 
R R L F Q S VKLHC P X CH LLQE V PH EGDLD I J FQDG AT KTPD VKI/QK 
rSLYDSKlWTTKNOKGRKVAVHFVKNNGlLPLSNECbliLlEGGT 
LSE) CKLSNKFNSV3 PVRSGHEPLELLDLSAFFLIOGTVKHYGC 
KQWS7* R S 1 QNLNSLVDKTSW! ? SSVAEALG3 VPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKFFX3 1 PASEVLMDDDLOKSVDMIMDMFC 
PPG 3 K 1 DAYPWLECF3 KSYNVTtfGTDNQI CYQ I FDTTVAEDVI 



6843



NHRKVI.SGAKRYECNECGKSFAYTSSLIIO^RIHTGERPYECSE 
CGRSFAENSSLIKHliRVHTGERPYECVECGKSFRRSSSLIjQHQR 
VHTR E F ;P YECSECG KSFSLRSKl. 1 HKQRVHTSERHE CGQCGKSF 
SRKSSLIIHLRVHTGERPYECSUCGKSFAETCSSLIKHLRVHTGE 
RPYEC1DCGKSFRJ^SSSFRRHQRVHTGMRPYK*SKFWKFSCPGF 
LLL03QRVHTGSRCYECDKWG1 FFS*NASFr T* KSAPTEEVPFE 
CNECE KA FS PLSLVTTI FT 



6844



"244 



642 



EHOLAG FELR KTQTSKSLGTTR EKTDR VKSTAYLS PQELEDVF Y 
0 YD VK S E I YS FG I VLWE I ATGD T P FQGCNS E KIR KJj VAVKRQQE 











PLGEDCPS ELREI J! DECRAKDPSVR PS VDEI LKKLS7FSK* CI K 
1 


6845


3 


lblS 


VAVRDECV K R HV FX DQDLWMLLF I LKCK PETARARLB Y R I RTLD 
G A 1>EN AQN1.GYQGAK FAME S A£ SGLEVCPEDI YGVQE VH VNG AV 
GLAFELY Y HTTQDLQLFR E AG GWDWRAVAS FWCSR VEWS PREE 
K Y HLRGW.S PD EYHSG VNNS VYTNV LVCNS LRFAAALAODLGLP 
1 PSQ-WUWADKI KVPFDVECNFHPEFDGYEPGEWKOADWLLG 
Y P V P FSLS P DVRRKNLEI YEAVT S POGP AMTWSMFAVGWMELKD 
A VRARG LLDR S FANMAE P FKVWTENADGS G A VNFLTG MGG FLQA 
V\ r FGCTGFRVTRAGVTFDPVCLSGl SRVSVSGIFYCGNKLiNFSF 
SEDSVTVEV7ARAGPWAPHLEAELWPS0SRLSLLPGHKVSFPRS 
AGRIQMSPPKLPGSSSSEFPGKTFSDVRDPLQSPLWVTLGSSSF 
TESLTVDPA£E*SGTGASETSIX5PSLWFRLHPPLLGTLLACHPS 
PAARIjSGKVHAAWPEFKAFCL 


6846




1256 


LYFLKT3K* LNRLAEHP* YENEKLTKJLRNTIMEQYTRTEESARG 
31FTKTROSAYALSQWITENEKFAEVGVKA3IHLIGAGHESEFKP 
KTONEQKEV 1 SKFRTGKI NLL3 ATTVAEEGLDI KECN J VI RYGL 
VTKE I AMVO ARGRARADEST Y VLVAH SGSGV I EHET VND FREKM 

WV A TUl^DfVMIfDPrVMIIfT ! n riMAC 7 V,t?VVMIfTlfI3Kn 1VUVW 
r. J wu nt v^l^rif^fkAt. JAnRJ i_it.ijVr)^o 1 rjL,l\l\J v lM TMCTV l AJvni IV 

NNPSLITFLC KNCSVIACSGED1 HV I EKKKHVNMTPEFKE1»Y I V 
RENKTLQKKCADYOINGEIICKCGOAWGTMMVHKGLDLPCL.KTR 
KFVWFKNNSTKKOYKKWVELFITFPNLDYSECCLFSUED 


6847 


1450 


346 


SMCWNSDRl,EMPLlDLALTLYPPSYVPyTGHLSDDSl,£RKYCLT 
WFEDALNGV L» RAEA1 QPHCVMAGDRM.EKFRQKYWNKLCT1.R0Q 

FGWRSLDAi.GWHERQUVLVKGLLAGWVFDWGAKAVSAYLESDP 
YFGFEEAKRKl>OERPWLVDSYSEWLORLKGPPHKCAL.aFADNSG 
ID1 1 LGVFPFVRELLLRGTEV1LACNSGPALNDVTHSESL3 VAE 
R IAGMDPWH S ALREERLLLVOTGS S S PCLDLSRLDKGLAALVR 
ERGADLW3 EGMGRAVHTNYHAAIjR CES LKLAVIKNAWiiAERLG 
GRLFS VI FKY E VPAE 


6848


19 


16 


AMWlWSLDGlRNIVLSNPKKRNTLSlJVMLKSLQSDILJ-jDADSND 
LKVI I ISAEGPVFSSGHDLKELTEEOGRDYHAEVFQTCSKVMMH 
IRNHPVPV3AMVNGLATAAGCOLVASCD3AVASDKSSFATPGVN 
VGL FCSTPG VALARAVPRK VALEMLFTGEP I SAQEALLHGLLNK 
WPEAELQEETMRXARKIASLSRPWSLGKATFYKQLPODltGTA 
YYLTSQAMVDNliALRl^GOEGlTAFLOKRKPVWSHEPVVEH 


6849


70 


82a 


SLGVDGSCLEC<;5;PAPRP0TDTSP*PVGNWATQQEDLYH0SYEC 
VCVLFASVFDFKEFYSESNINHEGLECLRLLNBIIADFDELLSK 
PKFSGVEKI KT1 GSTYMAATGLNATSG0DAOQDAERSCSHLGTM 
VEFAVALGS KLDVI N KHS FNNFRLRVG LNHGP WAG V J G AQ XPQ 
YP3WGNTWVASRMESTGVLGKI0VTEETAWALQSLGYTCYSRG 
VI KVKGKGOLCTYFLNTDLTRTGPPSATLG 


6850 


2 


1235 


ARGLNHEWTFEKLROHISRUAQOKQELHLFMbSGVPDAVFDLTD 
LDVJiKLELI PEAK I PAKI SOMTMLQEUILCHCPAKVEOTAFS FL 
R DH JL»R CLHVK FTDVA E I PAW V Y LLKN LR ELYTj I GNLNSENNKM I 
GLESLRELRHLKILHVKSNLTKVPSN3TDVAPHLTKLVIKNDGT 
KLLVLNSLKKMMNVAELE1>0NCELER 1 PHAIFSLSNLOELDLKS 
NN3 RTIEEU £ FQHLKRLTCLKLWHN K I VTI PPSITHVKNLESL 
YFSICNKLESLF VAVFSLQKLR CLDVS YNN I SMI P IEIGLLQNLQ 
HLHI TGNKVD1 LPKQLFKCI KLSTLNLGQNCI TSLPEKVGQLSQ 
LTOLELKGNCLDRLPAQLGOCRMLKKSGLWEDHLFDTLPLEVX 
EALN0DIN1PFANGI 


6851 


176E 


660 


VS AQ VS AREGENCLG WNLADSSQES Y K S LEEAEDCY P PSU/TLD 
LRDLFNQVEOGPLLSCPKAGTDLSMGRAREVGWMAAGbM I GAGA 
CYCVYKLTIGRDDSEKLEEEGEEEWDPDQELDEEEPDIWFDFET 











mafc pwtedgdvitepgapggtedrpsgggkanrahpl ko^pfpye 
hkntwsaqnckngscvldlskclfiogkllfaepkdagfpfsqd 
1nshlaslswarntsp7pdptvrealca.fdnlnas3 esqgq3 km 
yinevcketvsrccnsflcx?aglnll: smtvinnmlaksasdlk 
fpli5egsgcakvqvlkplkglsekpvlagelvgaqmlfsfmsl 

F J KNGNRE1 LLETPAP 


6852




4 07 


RTRGEETYANF3 KHNDGKN1 FYAARTP ATLFAVMFAMY 1 1 SGLT 
GF J GLNSI AVLCNLVMGLALI FLCTWAYVKYSGEFRE1 GTV3 DQ 
I AETLWEQVLKPLGDNLM5 E N I RQS VTNS 3 KAGLTDOV R HHAR L 
KTE 


6853




469 


GDi'CAVCI EL YXPNDLVR J LTCNH1 FKXTCVDPWLLEHRTCPMC 
KCDILKALG1EVDVEDGSV5L0VPVSKE.1FNSASSHEEDNRSET 
AS SG Y AS VQGT YEP PLEEHVQSTNESi.QLVNHEANSVAVD VI PH 
VDNPTFEEDETPNQETAVRE3KS 


_ 

6854 


1148 


585 


KES YIGTFDPGELCVCAA1 CWLQDNSAS YFLNRKLVYEFSTQAK 
PVKNTFLRMW3 YSHHI YQQDLRKK3 LL^VGKRLDVTGFCKTGKPG 
1 3 CVEGFKEHCEEFWHT1RYPNWKH3 5CKKAESVETEGKG2DLR 

LFGIESKSSDS 


6855


1913 


1148 


GRVGGRVGR3 CSPLSGANEY 3 ASTPT1KTEEVLLFTDQTDDLAK 
EEPTSI,FQKDS£TKGESGLV^GDKE3KQ3FEDLDKKLALASRF 
YI PEGCIQRVIAAEKWALDALHREG I VCRDLNPNN3LLNDRGH3 
r»t tv rcDwcrvmcm^nii t FPnrvr"ivPPvr;AT tfftfapdww^I, 

\JL» 1 I r oKHol V E»Uo v-i/oJJM l f\ r J J v~rw c vuni 1 e».e» i cm. l/h»ol> 
GAV 1 , FELLTG KTLVECH PAG I NTHTTLN?MFEWVSEE AR 5TLI QQL 
LQFNPLERLGAGVAGVEDIKSHPFFTPVDWAELMR 


6856 


1617 


997 


VTOLYVSVDASTKDSLKKIDRPLFKDFWOQFLDSLKAI^AVKOOR 
TVYRLTLVKAWrWDEIX^AYAQLVSLGNFDFIEVKGVTYCGESSA 
SSLTMAHVPWKEE VVQFVR E LVD LI PEY £ 1 ACEHEH5KC LLI AH 
R K F K 3 GG E W WT W I ^^YICR FQE L I OE Y E DSGGS KTFS AKD Y MAR T P 
HWALFGASERGFDPKDTRHQRKNKSKA1 SGC 


6857 


3 


617 


"KGPEATAMVCVCSKPNCR0KH1KPSHSAAOTWCGSPTPASAPNH 
KI»MAMEQGKTLPSATEDAKEEGIEAQ3S'RLAELIGRLE£ KALWF 
DLOORLSDEDGTNKHLQLVROEMAVC P KQLSEFLDSLROY LRGT 
TGVRNCFH3 TAVRLSDGFTFVI YEFKETEEAWKRHLQSFLCKAF 
RHVKVDTLSQP EALSRI LVPAAWCTVGR D 


6858


2 


669 


RSRG1 KDFENDPPLSSCG2 FOSRIAGDALLDSGIRISSVFASPA 
LRCV0TAKL3 LEELKLEKK I K3 RVEPG3 FEWTKWEAGKTTPTLM 
SLEELKEANFN3DTDYRPAFPLSALMPAESYQEYMDRCTASMVQ 
IVNTCPQDTGVJLIVSHGSTLDSCTRFLLGLPPRECGDFAQLVR 

ki pslgmcfceenkeegkv;elvnppv KTLTHGANAAFNVCRNWI s 

GN 




1 


11S0 


getmfkkaktkakkkprkrsdssggynlsdiiqspsstgllksg 
ktnsveslpelltsdsegsyagvgsprdlqspdfttgfhsdkie 
akvk p y vngts p vysredlk pweks p 3 lk 3 sapqp i p snr 1 dtt 
ssaswvagsfspvsppwblrtimeieesrqkcgatpkshlgkt 
vshgvklsqkqr kmi alttkenksgmk5t ketvlftps kapkpvn 
awasslhsv£sksfrdflleekk5vtshssgdhvkkvsr kgi en 
s0apk1vrcsthgtpgpegnhisdlplldspnpwlsssvtapsm 

VAJPNTTFASIVEEELOQEAALIRSREKPLAIjIQIEEHAICDLLVF 
YEAFGNPEEFV1 VERTPOGF LAVPMWNKHGC 


6860 


1885 


1515 


DKDKKRQKKRG 1 FP KVATN 1 MRAWLFQHLTHPY PS EEQK KQLAQ 
DTGLTI LQVNNWF I NARR3 3 VQPMI DQSNRAVSQGAAYS F EGQP 
MGSFVLDGQOh>K;iRPAGPh5SGMGMNMGKD<30WHYM 


6861 


1889 


1515 


DKDKKRQKKRG I FP KVATN I MRAWLFOHLTHPY PS EEQK KQLAQ 
DTGLTI LQVNNWF I NARR 3 3 VQPMI DQSKRAVSQGAAYS F EGQP 











MGS FVLDGQQHM6 J RPAOPK.ii GMGfrfcJMGMDGQWHYK 








EEi DREyilNKLKLKEDKLEKCEKPVNGEDKGDSGVDTONSEGNA ! 
DEEDPLGPNCYYDKTKSFFDKI SCDDNREFRPTWAEERRLKAET j 
FG1 PLR PNRGRGGYRGRGGLC- FRGGRGRGGGRGGTFTAPRGFRG 
GFRGGPGGREFADFEYRXTTAFGP | 


6863 
6864

j 


2216 




PQEPALKFEFSQVASKTI PLt-LPQPNTCKDNGPCKQVCSTVGGS : 

7\1 CSCFPGYAIMADGVSC^DC DECLMGAhlDCERRQFCVNTLGSF 

YCX^NHTVLCADGYlLNAiJRKCVDlNSCVTDLHTCSRGEHCVTJTli 

GSFHCYKALTCSPGYALKDGECEDVDECAMGTHTCOPGFLCONT 

KGSFYCOARQRCMDGFLQDPEGNCVDlNECrSl.SEPCRPGFSCl 

NTVGSYTC0RNPL1CARGYHASDDGTKCVDVNECETGVKRCGEG 

QVCHNLPGSYRCDC^GFORDAFGRGCIDVNECWASPGRLCOHT 

CENTLGSYRCSCASGFLI^AADGKRCEDVNECEAQRCSOECANIY 

GS Y0CYCROGY0LAEDCHTCTD3 DECAQGAG I LCTFRCLNVPGS ! 

Y0CACPE0GYTMTANGRSCKDVDECALGTHNCSEAETCHN1OGS 

FRCLR FECPPNYVQVS K7KCERTTCHDFLEC0NS PAR 1 7HYQLN 

FQTG1 AjVFAHI FR 3 GPAP AFTGDT3 ALN 1 1 KGNEEGY FGTRRLN 

AYTGVVYLORAVLEPKDFALDVEMKl.WROGSVTTFLAKMKIFFT 

TFAL 


2 


2S • - 


LADSSPSNLOI 1 1 KELL5 MHH0PDPALTKEFD YLPP VDSR SSSG 

FVGLRNGGATCYMNAVKO0LYMQPGLPSGLLGVDDDTDNPDDSV 

FYOVQSLFGHLMESKLOYVVFENFWKIFKMWNKELYVREQQDAY 

EFFTSLID0MDEYLKKMGRDCJFKNTFQG1YSD0KICKDCPHRY 

EREEAFMAl^LGVTSCCSLEISLDOF^GEVLEGSNAYYCEKCK 

EKR1TVKRTCIKSLPSVLV J HLMRFGFDWESGRSIKYDEQI RFP 

WXKMEPYTVSGMARQDSSSEVGENGRSVDQGGGGSPRKKVALT 

ENYEbVGVIVHSGQAHAGHYYSFlKDRRGCGKGKWYICFNDTVIE 

EFDLNDETLEYECFGGEYRPK"m)QTNPYTDVRRRYWNAYMLFy j 

QRVSDQNSPVLPKKSRVSWRQEAEDLSLSAPSSPEISPCSSPR 

PHRPNNDRLSILTKLVKKGEKKGLFVEKMPARIYOMVRDEKIjKF 

mxnr dvy s sdy fsfvlslas lnatklkhp y y pcmakvs lola 1 0 
flfotylrtkkklrvdteewiatieallsksfdacowlveyfis 
SEGRELI Kl FLLECNVREVRVAVAT J lektldsalfyodklksl 
HOL.LEVLLALLDKDVPEKCKNCAQYFFLFNTFVOKOG I RAG DLL 
LRHSALRHMISFU.GASR0NN0IRRWSSA0AREFGNLHNTVA1.L 

vlhsdvssornvapgifkorppisiapsspllplheeveajllfm 

SEGKPYLLEVMFALRELTGSLIjALIEMVVYCCFCNEHFSFTMLH 
Fl KNQLETAPPHELKttTF QLLHSI LV1 EDP1 QVERVKF V FETEN 
GLLALMHHSNHVDSSRCYQCVKFLVTLAQKCPAAKEYrKENSHH 
WSWAVQWL0KKMSEHYWTL0SNVSNETSTGKTFQRT1SA0DTLA j 
YATALLNEXEQSGSSNGSESSPANENGDRHLOQGSESPMMIGEL 
RSDLDDVDP 


6865 


I 1820 




DPERWKHLSKVTPPGSSVS'JTPVQWRLQSPQSQGSMMPSCNRS 
CSCSRGPSVEDGKWYGVRSYLKLFYEGYAVPPKLEGIGEGEFLV 
LDQRAADYN OALGTCRL-^GTALCVAAGVLLAI CLFWAM IGWLSQ 
DTKAEPLDP2ADSH VE V FGDE P EQQLS P I FRNASGQS W FS P PAS 
PFGQSSVOTIQPKRDS 


6866 


1571 


; 


DCPRPRYTLYGLRATCMxPLDWAWINAVSAFKALEODLPVN 1 KF 
IIEGMEEAGSVALEELVEKEKDRFFSGVDYIVISDWLWISCRKP 
A I TYGTRGNSYFMVEVKCRDQDFHSGTFGGJ LHEPMADLVALLG 
SLVDSSGH I LVPG 2 YDEVVPLTE EE I NTYKA I HLDLEEYRNSSR 
VEKFLFDTKEEILMHLWR YPSLSIHGI EGAFDEPGTKTVI PGRV 
IGKFSI RLVPHMNVS AVEK0V7RHLEDVFSKRNSSNKMWSMTL 
GLHPWJAN 2 DDTQYLAAKRAI RTVFGTEPDWI RCCST3 PI AKMF 
0E 1 VHKS WLI PLG AVDDG EH S ONE K I NRWNY I EGTKLFAAFFL 
EMAQLH 



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ID 


6867 
6868


2833 


1704 


GTR1MS0PKOKELAGFVROKMLLDYSVYMGKCVPQESRSPORSP 
LOS AESS PTAG KKLFEV P P SEEEEOEAWVNALLGR1 FViDFLGEK 
YWSDLV$KK 1 QMKI ,S K J Kl.F Y FMNELTLTELDMGVAVPK I LQAF 
KPYVDHOGLW 1 DLEKS YNGS FLKTLETKMNLTKLGKEPLVEALK 
VGEJGKFGCRPRAFCLADSDEESSSAGSSEEDDAPEPSGGDKQL 
LPG AEG YVGGHRTS K 1 MR F VDK 1 T KS K Y FQ KATETEF 3 KKKIEE x 
VSNTP bl >LT VEVQEC RGTLA VN I PP P PTDR VW YGFRKP PH VELK 
AR PK LG FR E VTLVK VTDW I E KK LEQE FOKVFVMPNMDDV Y I T I K 
HSAMDPRSTSCLLKDFPVEAADQP 


1 


3<m 


RPTRPPTRPEE1X>3L1LPY1 SUKNFVQDLCEDFYELF K7DKGFD 
KATFESO^S VMRGQl LNLTOAJ^RDGKSPFQLVOI PCVI VERSQG 
GSOGRIVHLSWSFTOTVNCRKPFFSSW 


6869

! 


3 


163 y 


MYMERMDKRAL1SFWESVEHLKNANKNEI POLVGEt YQNFFVES 
KE I SVEKSL.Y KE I COC LVGN KG I EVFY K10EDVY ETL.KDRY Y PS 
FIVSDLYEKLLI K5EEKHAS0MI SNKDEKGPRDEAGEEAVDDGT 
NQ1NEQASFAVNKLRELNEKL£YKR0ALNSIQNAPKPDKKIVSK 
LKDEI I LIEKERTDLOLHMARTDWWCENLGT^WKASITSGEVTEE 
NGEOLPCVFVWVSLOEVGGVETKIWTVPKULSEFHNLHRKLSEC 
VPSLKKPOLPSLSKLPFKSIDHTFMEKFENQLNKFLQNLLSDER 
LCQSEAi/* AFLS PS F DYLKV I DVQGK KNSFSLSS FLERLPRDFF 
SHQEEETF.SDSDLSDYGDDVDGRKDALAEPCFMblGEI FELRGM 
FKWVRRTH ALVQVTFG RTINKQI RDTVSW 1 FS EQMLVY YINIF 
RDAFWPNGKLAPPTTIRSKEOSOETKORAGCKLLENIPDMLOSL 
VGOQNAKHG 3 1 Kl FNALOETRANKHLLYAIjMELLLIELCPELRV 
HLDQLKAGQV 


6870

i . 

i 

I 


1 




MAAW AATRWWQLLLVLS AAGKtGASGAPQ P PN 1 LLLLMDDWGWG 
DLGVYGEPSRETPNLDRMAAEGLLFPNFySANPLCSPSRAAl.LT 
GRLP1RNGFYTTNAHARKAYTP0EIVGGIPDSE0LLPELLKKAG 
YVSKIVGKWHIX5HRPOFHPLKHGFDEWFGSPNCHFGPYDNKARP 
NI PVYRDWEMVGRYYEEFPINLKTGEANLTQ1 YLQEALDFI KRQ 
ARHHPFFLYWAVDATHAPVYASKPFLGTSORGR YGDAVRE 1 DDS 
I GK1 LELliOrJLHVADNTFV FFTSDNGAALI S APEQGGS NG P FIXT 
GKQTTFEGGMREPALAWWPGHVTAGQVSHQLGS I MDLF7TSLAL 
AGLTPPSDRA1DGLNLLPTLLOGRLMDR P I FY YRGDTLMAATLG 
OHKAHFWTWTNSWENFROGIDFCPGONVSGVTTHNLEDHTKLPL 
T Ptn* J f:pr)Pf;Rl3FPT.crReapYr>r"7X.T.^I?T T<5WOOHOFALVPAOP 
QLNVCNWAVMNWAPPGCEKXGKCLTPPESIPKKCLWSH 


6871


209 


1121 


RMSLNPP1 FLKRSEENSSKFVETKOSQTTSIASEDPLONLCLAS 
QEVLQKAOQSGRS KCLKCGGS R MFYCY TCYVPVENVP I EQI PLV 
KLPLKI Dl I KKPNETDGKSTAI HAKLLAPEFVNIYTYPCIPEYE 
EKDHEVALI FPGPQS IS1KDIS FHLQKR I QNNVRGKNDDPDKPS 
FKRKRTEFQEFCDLNDSKCKGTTLKKI I FI DSTWNQTNK I FTDE 
RLQGLL0VELKTRKTCFWRHOKGKPDTFLSTI EAI YYFLVD YHT 
D I LKE K Y RGQ YDN LLF F YS FM Y QL I KNAKCSGDKETG KLTH 


6872 


880 




FGLLMWLSLI FMKGN CVR EDL I FNFLFKLGLDVRETNGLFGNT 
KXLITEVFVROKYLEYRRIPYTEPAEYEFLWGPRAFLETSXMLV 
LRFLAKLHKKDPOSWPFHYLEALAECEWEDTDEDEPDTGDSAHG 
PTSRPPPR 


6873 


192S 


95E 


DEOAVLCSKDKTYDLK IADTSNMLLF1 PGCKTPDQLKKEDSHCN 
I IHTEI FGFSNNYWELRRRRPKLKLKLKKLLMENPYEGPDSQKEK 
DSNSSKYTTEDLLDQI OAS EE E 1 MTQLQ VLN ACK I GGY WR 3 LEF 
DYEMXLLNHVTQLVDS ES WS FGKVPLNTCLQELGPLEPEEMI EH 
CLKCYGKKYVDEGFAHf FELDADKICRAAARMLLG^AVKF^NliAEF 
QEVWQOSVPSGMVTSLDOLKGLALVDRHSRPEI IFLLKVDDLPE 
DNQERFNS LFS LREK WTEED I AP Y I QDLCGEKQTI GALLTK YSH 
SSMQNGVKVYUSRRPIS 
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SEQ 




3 


>07 


DS ; ADH VN S AA VNV fc'r C- T KNLG KAAKY KLAA1>F V A.G AL« 1 GGMVG 
GF 3 C- LI .AG FK VAG I AAA 1 >GGG VLG FTGGKLI QR .X KC KMMEKLTS 
SCrDLPSQTDKKCS 


6875


i688 


34 S 


VI C-'.'X=2RGNSASEKWE J MFNEELGDPFI I3HSJS LLNAEEHS1A 
TL^LRIEKEELDMKGSGrYVSLEVTVTISKKNODNKKVEZIKRBI 
LRG KSVFHYAAI EPIX;KGLMI VSyKSLTFVQAGQDL5:;SNMDED3 
SE}.:KEPLYYWOOTEDr'LTVTIRLPEDNTKED1010rLPDHINI 
VLK DHOFLEG KLYSS I DHESSTWI 2 KESNSLE 1 S L2 K KM EGLTW 
PEI.VJGDKQGELIRDSAOCAAIAERLMHLTSEELNFNPDKEKPP 
CN ACE LEECD 1 F r EE S £* S LCR FDGNTLKTTHWM LGSNQ Y LFSV 
1 VT ? KEMPCFCLRHDVDA LLWQPHS S KQDDMWEH I AT FNALGYV 
QA5.XRDKKFFACAPNy£YAALCECLRRVF3YRQPAPKSTVLYNR 
KEO RC VGOVAK0QVAS LETNDP3 LGFQATNERLFVLTTKNLFLI 
KVNVEN 


6876 


42 


1285 


VGF.M71. 1 WRHLLRPLCLVTSA PR I LEMHPFLS LGTSR TSVTKLS 
LH'J K PKMPPCDFMPEK YQV I FLVNSGSEANELAKLWARAHSNKI 
Dl j P FRGAYHGCSPYTLOLTNVG1 YKMELPGGTGCOFTMCPDVF 
RGFVPGGSHCRDSPVQT! RKCSCAPDCCQAKDQY 1 ZQF KDTLS TS 
VAK £ I AG F FAE P I QGVNG WQYPKG FLKEAFEL VRARGG VC I AN 
EVC'TGFGRLGSHFWGF0T}IDVLPDIVTMAKG1GNGFF.'-1AAVITT 
PE3 AKSJLAKCL0HFNTFGGNPMACAIGSAVLEV1 KEENLCENSQ 
EVCH V MLLKFAKLRDE FE 3 VGDVRGKGLMIGI EM VQFJK I SCR PL 
PRFFA'NOlHEDCKHMGl.i.VGRGSIFSQTFRIAP.SKCrrKPEVDF 
AV h V ¥ R S A LT G/HMER RA VC 


6877


: 


77 F 


GTS 7 i'PAKAYAPPTERKRFyONVSITOGEGGFEINLDHRKLKTP 
QAK l» FT VPSEAIAI AVAT LWDSQQDT I KY YTMH LTTLCNTS LDN 
P7 CKKKDQLI RAAVKFLDTDTI CYKVEEPETLVELCRNEWDPI I 
EWAi: KRYGVE 1 5 SSTS J MG PS 1 PAKTREVLVSHLAS YKTWALQG 
I E F VAAQLKSMVLTLGL 1 DLRLTVEOAVLLSRLEEE YC 1 QKWGN 
I E KAH D Y ELQELRARTAAG TLF I HLCS ESTTVKH KLLKE 


6878




2 1. 


0TLC-GDFKNRAFK3 DFNJ RI KNVTRSDAGXYRCEVSAFSEQGON 
LE E I- 7 VTLE VLVAPAVPS CE VPSS ALSGTWELR CODKEGN PAP 
EYTV;rKDGIRLLENPRlX:SOSTNSSYTMWTKTGTLOn>3TVSKLD 
TGEY 5 CE ARNS VGYRRCPG KRMQVDDI>NISGI 3 AAWV'"VALV1 S 
VCGLGVCYAQRKGYFSKETSFQKSNSSSKATTMSENDr KHTKSF 
11 


6879 


2 


84 b 


IRVlGESDJMQKFLSESDKNYNGVSDVEbRVALPnGTTVTVRVK 
KN £7 T DO VYQA I AAKVG M DS TTVN YFALFEV3 SH S F VR KLAPNE 
FPKK.LYlONYTSAVPGTCLTIRKKLFTTEEEILLrNDNCLAVTYF 
FHCAV DDVKKGY I KAEE K£ Y QLQKLY EQRKMVMY LN KL-RTCEGY 
NEj j FPIlCACDSRRKGJrVl T AI S 1 TH FKLHACTE EGOL ENQ V IA 
FEK:,EMQRWDTFEEGFiAFCFEYARGEKKPRWVK i FT PY FNYMHE 
CFERVFCELKWRKEEY 


6880 


2110 


1437 


RKD^CTAKEWTFPEAX^NTTARVFSHI RLGMGHVL1 I VQCFI SS 
MAN 3 YNEKILKEGNQLTES I FIQNSKLYFFGILFNGLTLGLORS 
NRDC -2 KNCGFFYGFfRAFSVALI FVTAFOGLSVAFI LKFLDNMFH 
Vl>*iAOVTTVI 3 TTVSVLVFDFRPSliEFFLEAPSVLLS 1 F 1 YNAS 
KPCV? E YAFROER I RDLSGNLWER5SGDGEELERLTKPKSDESD 
EDTT 


6881 


263t 


2244 


NDSh'KEDJ HV I J G ALKM F F R E L PE P JjF'1' FNHFN D F \TN A I KQEPR 
QRV^AVKDLIROLPKPNODTMOILFRHLRRVIENGEKNRMTYOS 
I A 3 V F G PTLLKF E KETGN 1 A VHTVYQNQ I VEL2 I LELS S 3 FGR 


6882 




850 


G3FrAOLWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWLV3N 
QEGKK VTAROEFRLVLI SLTCDGDTLTLSAAYTKObLLPI XTPT 
TN A*\*H K CR VHGLE I EGR I>CG E ATAQW I TS FLKS C F Y R hVH F E PH 
hmPF?.FHOIADLFRPKDC'lAYSDTSPFLIbSEASLADUN T SRLEX 
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seq' ~ 









K V KATNFh r N I VI SGCDV VAEDSWDELLl GDVELKRVMACSRCI 
I.TTVDPDTCATiSRKEPLETLKSYROCDPSIKKLYGKSPLFGQYF 
YLENPGT3 KVGDPVYLLGC 


6883 


2754 


2256 


i:5KbKLNCKLKLFITLTyOVLSl.HGWGPG3HLQKEGAFPVTOKR 
A I^LLYDLS Y LNI VLTA KG DEV KSGRSKPLSRl EKVTDHLEAL I 
D7 FDLDVF7 PHLNSNLHRLVORTSVLFGLYTGTENOLAPRSSTF 
NS CEPHN3 LPLASSQI R FG 1 .LPLSMTSTR KAK STRN1 ETKAQYD 

AKC 


6884 




S2 


ErERVTAEAVKPRETSEPRAAAU;RFCEKFPFi> 


6885 


?£•"/ 


1554 


51 GOFWHVTDLHLDPTVH I TDDHTKVCAS5 KGANASNPGPF GDV 
LCDSPY0LILSAFDF1KNSG0EASFM1WTGDSPPHVPVPELSTD 
TV ■ ?JVI TNMTTTIOSLFPNLCVFPAl^GNHDYW PQDOLSWTSKV 
YN'AVANLWK P WLDEEA 1 5TI R KGGF YSQK VTTN PNLR I ISLiNTN 
L Y Y G PN I M 71 N KTDP ANQ F EKL E S TL»NN S CON KE K VY 1 1 AHV P V 
GY) .rSSONl TAHREYYNEKL3 Dl FOKYSDVI AGQFYGHTHRDS3 
KVL SDXKGS FX^NSLFVAPA VTPVKSVLEKCTNNPGIRLFQYDPR 
DV KILDMLCYYLNLTEANLKGESIWKLEYZLTOTYDIEDLQPES 
1.YGLAXQF1 J LDSKQF1KY YNYFFVSYDSSVTCDKTCKAFQICA 
J MNI-DNI SYADCLKQLY 3 KJtNY 


6886 




1311 


oggg:pcreggssrpleegtgsspacvrga*,pgsedafyptrak 
car vsqelk kaaxrtvs 1 s egpdtlgdghke rretlalape pe p 
1kkeacekwkjrpfrsasatsltlshcvdvvkglldfkkrrghs1 

GGAFECRYC: 1 P VCV AAR L PTRAQDVLD AJILS E VNAVR FG PNS S 
LLATGGADRL1 HLWNVVGS R IE ANQTLEGAGGS I T£ VDFDPSG Y 
C'Vl^TYNOAAQLWKVGEACSKETLSGHKLKVTAAXFXLTRHCA 
VlGSRDRTVKEWDIXSRAYCf RTINVLS YCNDVVCGDHI I ISGHN 
D0K2RFWDSKGPHCTQV1PV0GRVTSLSLSKD0LHLLSCSRDNT 
LKV 1 DL>R V C N 1 R QVFRADG F KCGSDWT KAVFS PDRS Y ALAG SCD 
GAL Y I WDVPTGKLESRI^OG PH CAAVNAVAV.CY SGSHMVSVDQGR 
KVVLWQ 


6887




116 


KTARPSQKFFWEAGAVPGDPLSTGCSQAOLGGCCPRGPWGPQHG 
GCC-RAAG PT L PRGERGGFOOS G PGLAAQT P PT S KQVAWRAFLTG 
TYP^OSPRSFAGPFRGGTGV.-WPEPAVCLCVAVGPORLSSPGLVY 
NASGEEHCYDIYRLYHSCAEPTGCGTGPDA^AWDYOAGTEINLT 
rA^KNVTDKFPDLPFTDELRORYCLDTWGWFRPDWLLTSFWGG 
DL KAASN 1 1 F SNGNLDPW AGGG I RRNL S AS V I AVT I QGGAKHLD 
LKASHPEDPASWEARKLEATI IGEWVKAA^REQQPAIjRGGPRL 


6888 


J 


992 


FVAYVKKE I PH I WTHCLLK PHALVl KTLPTKLRDALFTWRV I 
NFJ KGRAPNKRLFQAFFEE ] Gl EYSVLLFHTEMRWLSRGCILTH 
1 FEWYEEIKOFLKHKSSNLADGFENKEFK ■ HLAYLADLFKHLNE 
LS AS MQRT GKNTVS AR EKL 5 AF VRKF PFWC-KR I EKRN FTNFPFL 
EE1 1 VSDNEGI FIAAEirLHLOOLSNFFKGyFSIGDLNEASKWl 
LP? FLFN1 VFVDDS YLMKNELA5LRASG0 1 LMEFETMKLEDFWC 
AO?TAFPNU>JCTALEILMPhA'rTYLCEU;FSITFTFONKVPEAA 

Lj lsddirvai skkvpsflghh 


6889 


J 


1534 


L-TLENQI KE K.R EQDNS ES PNGRTSPLVSCNKEQGSTLRDLLTTT 
ACKLRVGS7FAGIAFAPVY5MGAPSSKSGR7KPNILDDIIASVV 
ENK I PPSKTS KINVKPELKEEPEESI ISAVDENNKLYSDIPHSW 
I CE ru^l LWI.KDYKNSSNWKLFKEa^KOGQFAVVSGVHKKMNISL 
WKAES I S LD FGDHQADLLNCKDS 1 1 SN ANVK E FWDGFEEVS KRO 
KJv : :-;SGETV\'LKLKDWP£GEDFKTMMPARYEDLLKSLPl/PEYCNP 
EGKFNIASH^PGFFV^PDLGPRLCSAYGV^AAKDHDIGTTNLHI 
EVSDVVN1LVYVG1AKGNG: LSKAGILKKFEEEDLiDDlLRKRLK 
DSSEI PGALKK I YAGKDVDK 1 REFbOKl SKEOGL.EVLPEHOP1 R 
DOS W rVNKKLKORLLEEYG VRTWTM QFLGDAI VLPAGAL.HQVC 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








NFHSClOVTEDFVSPEHLVF.SFHLTCEbRl.LKEElNYDDKLOVK 
N I LYHAVKEMVRALKI HEDE VDDMEEN 


6890 


3 


6 67 


TKACGKWI PLYLKRALWKKTAETCNSPPCGAKDSLI FGAITCF 
TG FLG VPTG AG ATR W CRLKTQRADP LVCA VGM LGS A I FI CLI FV 
AAKSSI VGAYI CI FVGETLLFSNWAI TADILMYWI PTRRATAV 
ALQSFTSHLU?DAGSPYLJGF3SDLIRySTKX>SPLWEFLSLGYA 
LMLCPF\AA'LGGMFF1^TAI,FFVSDRARAE0QVK0LAMPPASVK 
V 


6891 


19B0 


1262 


I F lHOFM.SKFLKLLRGTT'I F^T I HIGI .AAGK.EGFM0DASNVMO 
LLLKTOSHLYWMEDNNPEVRQAAAYGLGVMAQFGGDDYRSLCSE 
AVPLLVKV 1 KRAHSKTKKNV 3 ATENCI SA3GKI LKFKPNCVNVD 
E V LP HVvLS W LPLHEDKEEA3 QTLS FLCDL T ES NHP W 1 GPNN SN 
LPK1 1 SI lAEGKINETINYELPCAKRLANWR^VQTSEDLVJLEC 
V S OLDDEOOEALQELLN F* 




3 
- 


876 


RSVAAASGPGAWGTDHYCLELLRKRDYEGYLCSLLLPAESRSSV 
r ALKArNVtJLiAUv]\_Ut Vi>fc-K 1 J^JjnKnUr « a J\ I VtUJ iLUW^rH 

0PVA3 ELWKAVKRHNLTKRWLMKJVDEREKNLDDKAYRNIKELE 
N Y A ENTQS S LL Y LTLE I LG 1 KDLHADHAASH J GKAQG3 VTCLRA 

1 r I rH.»i; KX 2\ w f Jji^niyil LPJiiHy VoyLi/r i *X KT* V(J.V' v lsDv x I LiJrt 

SOAHLHLKHARSFHKTVrVXAFPAFLOTVSLEDFLKKlQRVDFD 
1 FHPSLQQKNTLLPLYLYI OSWRKTY 


842 


CX^ERKSMSVERTFSElNKAEEOYSLCOELCSELAQDLOKfcRLKG 
RTVTI KLKNVNFEVKTRASTVSSWSTAEE1 FAIAKELLKTEID 
ADFPHPLRLRLMGVR 1 SS FF NEEDRKKQQRS I x GFLQAGNQALS 
ATFCTLE'KTDKDKFVKPLEMSHKKSKFDKKRSERKWSHODrFKC 
EAVNKOSFQTSQPFQVLKKKf-INENLE I S ENS D D CQ I LTCPVCFR 
AOCCI SLEALNKHVDECLDG PS 1 SENFKMFSCSHVSATKVNKKE 
NVPASSLCEKQDYEAH 


6894 


1742 


2463 


TTLCKPLVPREHQFYETLPAEMRKFTPOYKGKSQLLEGLPHWRG 
DVKDRGHGRPWQPSLEPSLFPTLCFPSLSSFSSSWPSAQHLTPS 
VFNPW 


6895 


2319 




VTYVF.LCDLASPTALLIMRTVLDLIVEDLOSTSEDKEQOYTSQT 
TRLLALLYALASHKACKXA3 LHL3 NGT1 KGHERYAEI FQDLLAL 
VRSPGDSV IRQQCVEYVTS1 LQSLCDQDI ALI LPSSSEGS3 SEL 
EQLSNSLPNKELMTS I CDCLLATLANSESSYNCLLTCVRTMt4FL 
AEilDYGl.FHLKSSLRKNSSALHSLLKRVVSTFSKDTGELASSFL 
EFMRQILNSDTIGCeGDDNGLMEVEGAHTSRTMSlNAAELKQLL 
OSKEESPENLFLELEKLVLEHSKDDDNLDSLLDSWGLKOMLES 
SGDPLPLSDODVEPVLSAPESLONLFT7N7?TAYVIJU3VKDDOLKS 
MWFTPF0AEE1DTDLDLVKVDLIELSEKCCSDFDLHSELERSFL 
SEPSSPGRTKTTKGFKLGKHKHETFITSSGKSEYIEPAJC3AHVV 
PPPRGRGRGGFG0GIRPHD3FR0RK0NTSRPPSMHVDDFVAAES 
KEWPODGIPPPKRPLKVSC'KISSRGGFSGNRGGRGAFHSONRF 
FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 
PLPPLRPLSSTGYRPSPRDRASRGRGGLGPSWASANSGSGGSRG 
KFVSGGSGRGRHVRS FTR 


6896 


1 


555 


GNlVIOKKKYNKQHlIPLEN^TlDSIkDEGDLRNGWLlKTPTKS 

fa vyaatateks ewmnh i nk cv7dlls ksgktpsnei£aavwvpd 
seatvcmrcqkakftpvnrr>:hcrkcgfwcgpcsekrfllpsq 
sskpnmicdfcydllsagdmatcoparsdsyscslksplntjmsd 
pdddddssd 


6897 


3 


920 


G DG LMHE VVN G LME R PDW ET A 1 QK PLCS LP AG SGNALAAS LNHY 
AGYEQVTNEDLLTNCrLLIXRRLLSPhlNLLSLHTASGLRLFSVL 
SUVWGFIADVDLF^EKYRRLGEMRFTIjGTFLRliAALRTYRGRLA 
YLPVGRVGSKTPASPVWQCCPVDAHLVPLEEPVPSHV/TVVPDE 
nF\7^VIALLKSHU?SEMFAAP«GRCAAGVMHLFYVRAGVSRAML 








LRLFLAf^FKGRHMEYECPYLV^PVVAFRLEPKDGKGVFAVDGE 
LMVSEAVOGOVHPNyFWMVSCJCVEPPPSWKFOQMPFPEEPL 


6898 




34t 


OKTVTAVASLLKGRCG IYTEN LRRMGAV1 K I R FFKI MLVLI 1 CW 
LSN2INESLLFYLEMQTDINGCSLKPVRTAAKTTWF3MGILNPA 
QG FXLS LAF YG WTGCS LG FQ. r P R KE 1 QWESLTTSAAEGAH FS PL 
MPHENPASGKVSQVGGQTSDCALSMLSECSDASTIEIHTASESC 
NKNEGDFALPTHGDL 


1 6900 


120 


827 


MKVKKNNCAYLLDKNKlNMDCFISCFFKKr^LTTLMFSKSGlLSL 
LEHGEEYTFSLPCAYAJ^SILTVPWVELGGKVSVNCAKTGYSASI 
TFHTKPFYGGKLHRVTAEVKHN1TNTWCRVQGEWNSVLEFTYS 
NGETKYVT>LTKLAVTKKRVHFLEKODFFESRRLWKNVTDSLRES 
E 3 DKATEHKHTLKERQRTEE* K PTETGTPKKTKYF1 KEGDGWVY 
HKPLWKI I PTTQPAE 


I 


3 


45j 


TEVLGSKG1KELRSSTSALH1IALEESASLLTMFWRAALPSTHIP 
VLPGKVGESTERELLELRTKVSOOEQLLOSTTEHLKNANQQXES 
:<E0FIVS01»TRTHDVLKKAJ\TNLEVRKL;LH0SEAPSLSPTHHHP 
LADLVGDSWPALR FQEK 


6901 


2 


20Z 


DDNMVQRLE TDFKMThQQQSZL EQ WAAWLDIW1MQALKP YEGR P 
SFFKAAROFLLKWSFYRYHLCFC 


6902 


2 


267 


GAPPPPPSQPPRQPPQAAPSSKPHSDLTFNFSSALEGQAGAOGA 
SDMPEPSLDLLFELTNPDELLSYLDPPDLPSNSNDDLLSLFENN 


6903 




149 


R INQVYR03PTGI HI LVI DOMVONFQDKSCFLFSTVKAESSDGl 
HI ILK 


6904 


464 


2092 


VGNFFGSTQDAEWEEYKTCi K <AP1 QT YVLGANNQETVKYFQDA 
^C4C*F LfiBKni FTO ^ c C-LOI VY LSGTE SLNE PVPG Y S F 
SPKDVSSLRMMLCTTSQFKGVCILLTSPKPKCVGNrGNSSGEVD 
TKKCGSALVSSLATGLKPRYK FAALEKTYYERLP YRNHI I LQEV 
A0KATRFIALANVGNPEKKKYLYAFS3VPWK1.MDAAELVK0PPD 
VTENP YR KSGOKAS I GKQI LA F VEESACQFFFDLNEKQGRKRS S 
TGRDSKSSPHPKOPRKPPOPPGPCWFCliASPEVEKHLWNIGTH 
CYI.ALAXGGLSDDPTVLILP3 GKY QS VVELSAEVVEEVEKYKATL 
RRFFKSRGKWCVVFERNYKSKKLQLQVI PVP1SCSTTDDIKDAF 
I TQAQEQQ TELLE I PERSDIKO I AQPGAAYFYVELDTGEKLFHR 
IKKNFPLOFGREVLASEAlLi:v?DKSDKROCOISKEDEETLARR 
FRKDFEPYDFTLDD 


6905 


1 


226 


VSKTGEAET1TSHYLFALGVYR7LYLFNWI WRYHFEGFFDLIA1 
VAGLVQTVLYCDFFYLYITKVLKGKKLSLPA 


6906 


3 


611 


SYDDHNGH2DF1TAASNLRAKMYSIEPADRFKTKRIAGKIIPAI 
ATTTATVSGLVALEM1 KVTGG Y PFEAY KJvKFLNLAl PIWFTET 
TEVRKTKIRNGISFTIWDRWTVHGKEDFTLLDFINAVKEKYGIE 
PTMWQGVKMLYVPVT1PGHAKR LKLTMKKLVKPTTEKKYVDLTV 
SFAPDIDGDEDLPGPPVRYYfSHDTD 


6907 


2 


2228 


LRG VPVWAAGAFR FS S GEES T? :-: L I MSR RSQR LTRYS QGDDDG S 
SSSGGSSVAGSCSTLFXDSPLK7LKRKSSNMXRLSPAPQLGPSS 
DAHTSYYSESLVKESWFPPRSSLEELHGDA^GEDLRVRRRRGT 
GGSESSRASGLVGRKATEDFLGSSSGYSSEDDYVGYSDVDOOSS 
£S RLRSAVSRAGSLLWMVATS FGRLFRLLYWWAGTTW YRLTTAA 
SLLDVFVLTRRFS SLXTFLW FLi PLLLLTCLTYGAWY FY PYGLQ 
TFK PALVS WWAAKDSR RADEG W E ARDSS FK FQAEQRWSRVHS L 
ERRLEALAAEFSSNWOKEAMkLERLELROGAPGQGGGGGLSHED 
TLALLEGLVSRREAALKEDFR R ETAAR1OEELSALRAEH0QDSE 
DLFKKIVRAS0ESEARIOOLKSEW0SMTCESFOESSVKELRRLE 
D0LAGUX?FI^AALALKQSSVAEEVGLLPO0IQAVRDDVESQFPA 
Wl S 0 FLARG GGGR VGLLOREEM C AQLRELE SKI LTKVAEMQGKS 









AR E AAAS L 5 1 »T LOKEG V I G VTEEQ VHH 1 VKQALQR YS E DRIGLA - 
D Y ALESGG AS V I STRCS ET Y ETKTALLSLFG 1 PLW Y K S QS PR VI 
LOPDVHPC? ?CWAFQGPQ)GFAWRL 5ARIRPTAVTLEHVPKALSP 
KSTISSAPKDFAIFGFDEDLOOEGTLLGKFTYDODGCr-IOTPHF 
0APTMATYCVVELR3LTNWGHPEYTCIYRFRVHGEFAH 


6908 


3 


7 60 


qvpsaaw1^vcglgsrlgi>gsrlgl0gcfgaarllyfrf0srg 
pogvedgdkpopssktpripkiytktgdkgfsstftgerrpkdd 
ovkeavgtjdelssaigfalelvte:<ghtfaeelokioctlqdv 
gsalatpcl'sareahlkyttfkagpileleowidkytsqlpplt 
af1 lpsggki ssalhfcravcrkaxrrwplvqmgetdai^vakf 
lnrlsdylf tlaryaamkegnqeki ykkndpsaesegl 


6909 


3 


4 C Sr" 


GRLLAVGT L :LYGQRSSAPEOELLVODATPVSNSLliPEKAFSDIF 
SPYliRGTJKMMQAVRQAFQDQDDRRTWDGRPLTWAATFDDCLYA 
LCWDT3 K? SSQTGEWQN3A3MTEEPELSPAYLI SEAMRRSRMS 
LYC 


6910 


1 


306f 


LVPVW3DSYYYGKLV1APLN1VLYNIFTPHGPDLYGTEPWYFY 
LlNGFLNFIA'AFAIALLVLPLTSLMEYLLORFHVONLGHPYWbT 
LAPMYIWF: IFFI0PHKEERFLFFVYPL1CLCGAVA1-5AL0HSF 
LYFQKCYHIVFQRYRLEHYTVTSNWLAJlGTVFLFGLLSFSRSVA 
I/FRGYHGPLDLYPEFYR-ATDPT3HTVPEGRPVNVCVGKEWYRF 
P S S F LL PEN W 0 LOF 1 P £ E FRGQL F K P F AEG PLATR 1 V P T DMN DO 
NLEEPSRY3DISKCHYLVDLDTMRETPREPKYSSNKEEVJISLAY 
RPFIjDASM ? KLLRAF YVPFLSDOVTVYVNYT I LK PR KAKQ3 RK 
KSGG 


6911 


1184 


96C 


GEDAEEMETGNVANL1S3FGSSFSGLLRKSPGGGREEEEGEESG 
PEAAEPGC1 CCDKPVLRDMNPWSTAIVAF 


6912 


1 


644 


AKKPVETHi FOMLFTl LSTGSALKAOSYEDAYRCI KSK J LI&S1 
SGGTDI 3 SCFMGHNFSLPVYKGE3 CARNIjGMAVEAWNEEGKAVW 
GFSGELVCTKPI PCQPTHFWNDENGNKYRKAYFSKFPG3 VJAHGD 
YCR I NPKTGG I VMLGRSDGTLNPNGVRFGSSE I YNI VE S FEEVE 
DSLCVPOYKKYRSERVILFLKMASGHAFOPDLVKRIRDAIRMGL 
S/\RHVPSL ] LETKGI PYTLNGKKVEVAVKQ1 3 AGKAVEOCGAFS 
NFETLDLYRDIPELQGF 


6913 


1643 


. 1 E 5 8 


KKfHEESHKEELSYGAQASLPLPCSDFR 


6914 


12S1 




ELAAECKSAGYPGTLIPYRCDLSNEEDILSMFSAIRSOKSGVD3 
C3 NN AGLAk PDT1>LSGSTSGVJK3DMFNVNVLALS 1 CTREAYQSMK 
ERNVDDGKJ I N2 NSMSGHRVLPLS VTHFYSATKYAVTALTEGbR 
0ELREAQTH3 RATC3 SPGWETQFAFKLHDKDPEKAAA7YEQMK 
C1.KP ED V AEAV I YVL>ST PAH1QIGDI QMRPTEQVT 


6915 


254 


652 


GRSLSFKTF 1,1 WVi>3 S 3 YOGG I LMYGALVLFE SEFVH W A 3 Sr T 
Ai.3 LTELLMVALTVRTWH WLMWAE FLS LGCYVSS LA FLNEY FD 
VAF3 TTVTF J -WKVSA3 TWSCLPLYVLKYLRRKLSPFSYCKLAS 


6916 


254 


652 


GRSLSFKTKL3WVL»3S3YQGGII>5YGALVLFESEFVHWAISFT 
AL1 LTELLWALTVRTWHvnjMWAEFLSLGCYVSSLAFLNEYFD ! 
VAFI TTVTF LWKVSAI TWSCLPLYVLKYXRR KLS P PSYCKLAS | 


6917 


254 


652 


GRSLSFKTFl,IVm.IS3YOGGILMYGAbVLFESEFVHVVAISFT 
AL3LTELLWALT\mTvmWLMWAEFLSliGCyVSSLAn J NEYFD 
VAFI TTVTFLWKVSA I TVVS CLPLYVLKYLRR KLS PPS Y CKLAS 


6918 


28 


921 


peagtrswrepdpedlrrfllsaacrsfpowlpgggggovsscs j 
dtdvpylllavksepgrfaeroavretwgspapgirllfllgsp i 
vgeagpdloslvawesrrysdlllwdfldvpfn0tlkdllllaw 
lgrh cptvs fvlraoddafvhtpallahlral ppas ah slylge 
vftcamplrkpggpfyvpesffeggypayasgggyviagrlapw 
llraaar vap fpfedvytglci ralglvpoah pgfltawpadrt 
adhcafrnlllvrplgpoasirlwkolqdprlqc 



570 
PCT/US00/34263 




6919 



6920 



6921 



6922 



6923 



6924 



6925 



6926 






bSO 






QGRRELSGSV! CPFI QQEPKEMLTLSEYHERVRSGGQQLQCLCA 
ELDKLHKEV 5? '. ' VRAAN SF.R V AKLV FQRLN EDFVRKPD Y ALS SVG 
AS I DLQKTSK. YADRNTAY FWNRFS FWNYARPPTVI LEPHVFPG 
NCWAFEGDQGC WIOLPGRV0LSDITLQHPPPSVEHTGGA.VSAP 
RDFAVFFLLf? FTHOGLOVYDETEVSLGKFTFDVEKSEIQTFKL 
QNDP PAAF P KV K 1 0 1 LSN WGH P RFTCLY RVRAHGVRTS EGAEGS 
AOGPH 



1418 



591 



J7i: 



10 75 



261- 



2469 



1660 



2210 



1653 



I EAOGPSKVHL: LKKKK 

MNATRS E EQ F !~ V I NHAEQTLR KMEN YLKE KQLCDVLL I AGH LR I 
PAHRLVLSAVS £Y F AAMFTNDVLEAKQEEVRMEGVDPN ALN SLV 
OYAYTGVLQLKUDTT EPLLAAACLLQLTQVIDVCSNFLl KQLHP 
SNCLGIKSFG^AOGCTELLhA/AHKYI-MEHFIEVIKNQEFLLLPA 
I NEISKLLCS£r j KVPDEET3 FHALMQWVGHDVQNROGELGMLLS 
| Y I RLPLLPPO- 1lADLETSSMFTGDLECQKLLMEAMKYHLL,PERR 

i smmosprtkpk :<stvgalyavggmdamkgtttiekydlrtnswl 

j H 1 GTMNGRR LC ) : G V A V I DM K LYVVG GR DG LKTLNTV E C FN P VG K 
| I W T VM p PMS TH K HG LG VATLE G PM Y AVGGKDG WS Y LNT V ER WD P 
[ EGROMNYVAS K <• TPRSTVGWALNNKLYMGGRDGSSC1 >KSMEY 
I FDPHTNKWS LC/..PMS KRRGGVG VATYNGFLY WGGHDAPASNHC 
I ERLSDCVERVr • KGDSWSTVAPLSVPRDAVAVCPLGDKLYVVGG 
YDGHTYLNTV1 5 YDAQRNE W K E E VP VN I G RAG ACVWVK LP 



I LTPPAGIRHtVK DKERERERliUERKKFPLDSTGSELKQNlHSlT 
I GLPPAMOKVKy):GLAPEDKTI.REIKVTSGAKIMGGGSTINDVlA 
I VNTPKDAAOOL^. KAEENKKEPLCROKQHRKVLDKGKPEDVMPSV 
i KGAQERLPTVl ; SGMYNKSGGKVRLTFKjLEQDQLWIGTKERTEK 
I LPMGSIKNV\^EPIEGHEDYHMMAFQLGPTEASYYWVYWVPTQY 

VDA: KDTVLGKWyYF 

LGLFCIliPIE'; 1 CAVLERDTLS IRESRLFGAVVRWAEAECQSQQ 
LPVTFGNKQKVi GKALS LI R F PLMT I KEFAAGPAQSGI LSDR EV 
j VNLFLHFTVNJ >: PRVEYIDRPRCCLRGKECCINRFQQVESRWGY 
! SGTEDR I R FTW kRISIVG FGL YGS 1 HGPTDYQVN IQI3EYEKK 
I QTLGQNDTGF5 CDGTANTFRVMFKEPIEI LPNVCYTACATLKGP 
I BSHYGTKGLK*: V ^KETPAASKTVPFFFSSPGNNNGTSIEDGQIP 
I E3IFYT 



12 3 5 PEERVICFVEyybTAFHEGRKGALAKKPYNPIlGETFHCSWEVP 
: KDRVKPKRTA5SSPASCHEHPMADDPSKSYKLRFVAEQVSHKPP 
i I SCFYCECEEK 1 I,CVNTHVWTKSKFMGMSVGVSMIGEGVLRLLE 
■ HGEEYVFTLPf;.YARS3LTIPWVELGGKVSINCAKTGYSATVIF 
| HTKPFYGGKVK.^ VTAEVKHNPTNTI VCKAHGEMNGTLEFTYNNG 
j ETKV I DTTTL) V VPKKIRP LEKCGPMESRNLKRE VTR YLRLG DI 
DAATEQKRHLL I KQRVEERKRENLRTPWKPXYFIQEGDGSG1 LQ 
I SPLESTLMGLEVQSFPV 



733 



rggaagaamef: s V3EDKTI elmcsvprslwlgcanlvesmcal 

SCLOSMPSVRC;,03SNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 
DQWSESDQVEFVrHLlSRMCHYOHGHINSYLKPMLORDFITAL? 
EQGLDHIAEN: I SYIJ3ARSLCAAELVCKEWQRVlSEGMLWKKJjI 
ERMVRTDPLWKGLSERRGWDOYLFKNRPTDGPPNSFYRSLYPKI 
IODI ETIESNVK CGRHNL0R 1 OCRSENSKGVYCL0YDPEKI I SG 

LRDKSI kiwdk: S leclkvltghtgsvlclqydervivtgssds 

TVRVWDVNTGF V LNTL1 HHN EAVI.HLR FSNGLMVTCSKDRS I AV 
WDMASATDITL7- R VLVGFJIAAVNWDFDDKYI VSASGDRT3 KVW 
STSTCEFVRTLNGHKRG1ACLQYRDRLWSGSSDNTIRLWDIEC 
GACLRVLEGHF K ^VKCI RFDNKRI VSGAYDGKI KVWDLQAALDP 
RAPASTLCLRT; VEHSGR VFRLQFDEFQI I SSSHDDTI LI WDFL 
NVPPSAQNETRS PSRTY7Y I SR 



SGRVAMDGLGK F P5QGFPAGPPLLPPHMGGHYRDCQSLGAPPL 








DG y PLPTf DTS PliDGVDPDPAFFAAPMPGUCPAAGT YS YAQVSD 
YAGPPEP F AG PMHPRLG PEPAGPS I PGLI A P PSALH V Y YGAMGS 
PGAGGGRGrOMOPCHOHOHOHOHHPPGPGCPTPPPFALPCRDGT 
DPSOPAELlGEVDRTEFEOYLHFVCXPEMGLPYOGHDSGVNLPD 

SjHgaiss vys das sa vy y cny pdv 




2 


1484 


L7 LCGD 1 C"-MIACN AN N RAAH LE E FH Y QT KE DOS 1 L K S LHR ES S 
CQG FA WA TCLS TDhESQl>S VS CXC YEAANE I LQFRDi .KSQNPEH 
YVQVLKRKGN I RNE 1 GV F YMNQAAAIiQSEKI.V S KS VS AAEQQLW 

KKSFSCFEKGI H-^FESl edatnaalllcntgrlmricaqahcga 

GDEJ.KREFSPEEGLYyNKAIDYYLKALR^LGTRDIHFAVWDSVN 
WELSTTY FTMATL QQD YAPLSR KAQEQlEKEVSZAtW KS LKYCD 
VDS V S ARC p LCQ V RAAT 1 LASM YHSCUW OVGDE H LR KQHR 
VLADLHY6KAAKLF0LLKDAPCELLRVQLERVAFAEF0MTS0NS 
NVGKbKTL. c GAl>DIMVRTEHAFQLIOKELlEEFGQPKSGDAAAA 
ADAS P SLNR EEVMKLLS I FSSRLSFLLLQS 1 KLLSSTK K KTSNN 
IEDDTILKTirKHIYSOLLRATANKTATLLERlNVIVKLLGOLAA 
GSAASSNAVG 


6928 


1086 


777 


EA3 DLINKI I>OVK>mKRySVDKTLSHPWLODYQTWLDLRELECK 
IGFRYITHE^DDLRWEXyACEOGLQYPTnLlNPSASHSDTPETE 
ETEMKALGERYS1L 


6929 


1749 


607 


RDORGYRDr.RSPAKEPGDVSARTRSGGGGGRSATTA^.PPPVPNG 
NLKQHDPO^LRHNGNVWAGRPSCSRGPRRAl OK POPAGGRRSG 
RGPAAGGLCiX5PrJGGTCVPKEPPVPPMDW£ALEKRLAGLOFRE 
OEVRNUGO AKTN S TS AQKNERES IRQKLAIjGS F FDDG PGIYTSC 
SKSGKPSLFHRLCSGMNLQ1CFVNDSGSDKDSDADDSKTETSLD 
TPLS PMS KO£ £ £ Y S DRDTTEEESESLDDMDFLTRQK Kl >QAEAKM 
ALA-M/vKPMAKvMOVKVEKQNRKKSPVADLLPKMPHISECLMKRSI. 
KPTDLRDM71 GQLOVlVNDbHSQlESLNEELVOLLLl RDEbHTE 
QDAttLVDl EDLTKHAESQOKHMAEKMPAK 


6930 


131 


545 


FKETANVFVSLFOKRMJFRHyFlEPSQLKLFYDVlTWlVTQVAl 
SYTWPFVLLS3KPSLTFYSSWYYCLHILG1LVLLLLPVKKT0R 
R KN7HEN I O I'SQS K K FDEGENSLGQNS FS TTNNVCNpNQE 1 ASR 
HSSLXQ 


6931 


2 


6S9 


r^ERLPNRFACLLVASGAAEGVSAOSFLHCFTKASTAFNLQVAT 
PGGKAWEFVDVTESNAR WQDFRLKAYASPAJaES I PGAR YHALi 
LI PSCPGALiTrijASSGSLARl LiQHFHSESKPI CAVGH^VAALCC 

atnedrswvfdsysiJtgfsvcelvrapgfarlflwecfvkdsg 
acf5a.se pda vhvv/ldrhlvtgckasstvpavoni>lflcgsrk 


6932 




1133 


FVDcpGQGECAEEEEGGJQMNSRMRAHSPAEGASVESSSPGPKK 
SDMCEGCRS LAAGKPGY ISHDKSTSI KYVSHOHPSH POLFS I VR 
OACVRSLSCEVCPGRSGPIFFGDEQHGFVFSHTFFIKDSLARGF 
OR WYSI I Ti KMDR I YL7 NSWPFLLGKVRGI J DELQGKALKVFEA 
EOFGCPQRAQRKNTAF^PFLHORNGNAARSLTSLTSDDNLWACL 

WDNSEAEEEVKAPVLPESTEGRELTQGPAESSSLSGCGSWQPRK 
LPVFKSLRRKROVGGRGTAHHELRRRANHGLCLPTRILASGPSTL 
KTLQEVTDS L LGGWLMAQGVGGI 1 


6933 


1431 


890 


SLNLHCTLPi-FPHOYPAGYPSDKEGKKPKGOSKXOPSGTTKRPI 
SDDDCpSASKVYKASDSAEAIEAFOLTPOOOHLIREDCQNOKLW 
DEVLSHLVEGFNFLKKLEOSrTICVCCQELVYOPVTTECFHNVCX 
DCLORSFK^OVFSCPACRHDLGQNYIMIPNEILOTLLDLFFFGY 
SXGR 


6934 


3030 


2588 


DRDHSQCGGJ KRVALARVSSVKLISKAKJRTVKMTPI J VTAFJV 
CWTPFFFVOHWSVWDANAPKEASAPIIVMLLASLNSCCKPWIYM 
LFTGKLPHELVOR FLCCSASYLKGRRLGETSASKKSNS SSFVLS 
HRSSSORSCSOPSTA 




686 


S43 


;I S ALY V AGGNDC-TS t LNSVERl S P KAG A WE S VAPMN I RRS TKDL 
VGVAVLELLNFPPPSSPTLSVSSTSL 






S67 


V. S KRRQ F LS 3ALLE F FG KS K P P P H RLFR K S LNVG LHY SH I P F LT 
T C I H KLR KJiLQKGE VGLS VETSKPQVP VGGLS RKKVPQE t'KATV 
KEXRLOEAQLYKEEC^QRYREGKYRDAVSRYHRALLO-LRGLDPS 
LP SPLPNLG PCGPALTPEQENI LHTTQTDCYNNLAACLLONE PV 
KYFRVREYSOKVLEROPDNAKALYRAC-VAFFHIX^DYDOAKHYLL 
AAVN R OP KDANVRR Y LO LTQS ELS S YKR K E KQLYLG M KC- 


6937 


J 


72 7 


AVXFKCCPGRnPACFARGWRLDRVYGTCKCDOACRFTGDCCFDY 
DKJiCPARPCFVGEWSPWSGCADOCKPTTRVRRRSVOOEPQNGGA 
^CPPLtERAGCLEYSTPCJGQDCGHTyvPAFlTTSAFKKEKT^OA 
??-pHW5?THTH:DAGYCMEFKTESLTPHCALENRPI,lRWM0VXREG 
YTVCVDCQPPAMNSVSLRCSGDGLDSIX;>iCTLHWQA3GNPRC0G 
TW KKVRR VDQCSCPAVHSFIFI 


6938 


3 


719 


NSR KLELAER VDTD FMULKKRRQSS EKEN DSGTLDT VG A VW'DH 
1 G NV A*\A VS SGGLAI .)CH PGR VGQAA L YGCGCWAENTG AHK P YS T 
A VSTSGCG EH LVRT2 LA3ECSHALQAEDAHQALLETMC-NKF1 £S 
IKlJXSEDGVLGGVlVLRiJCRCSAEPDSSONKOTLLVEFLWSHTT 
LSMCVCYMSAODGKAKTHJSRLPPGAVAGOSVAIEGGVCTnLGEP 
5. ELTLCAECEASORHFRT 




3 


810 


KVTAFRRPQRYSSGHGSDNSSVLSGELPPAMGRTALFl-HSGGSS 
(-T ESLRRDSEATGSASSAPDSMSESGAASPGARTRSLKSPKKRA 
TGL0RRRIilPAPLPDTTALGRKPSLPG0W\T3LPPPLAGSl.KEPF 
r I KVYElDDVERL0RPRPTPREAPTQGIiACVSTRI>RLAERRO0R 
LREVQAKHKHLCEELAETOGRLMLEPGRWLEOFEVDPFLEFESA 
I. Y LAAI ERATAALEQCVNLCKAHVKMVTCFDI SVAASAAI PGPQ 
fVDV 




11 88 


496 


C- KKAAOFLRHRSRCATPPRGDFCGGTERAlDOASFTTSMZWDTO 
WKG$SPLGPAGLGAEEPAAGPOLPSWLOPERCAVFOCACCHAV 
LADSWLAWDLSRSLGAVVFSRVTIWVVLl-APFr.VGlEGSLKGS 
Ty7CLLFCGSCGIPVGFHLYSTHAALAALRGHFCLSSDKKVCYLL 
KT>CAlVNASE?4D10N\'PLSEKlAELXEKlVLTHNRLKSLKKaLS 
F.VTPD2SKPEN 




1 


713 


i: ^5r? JOuJbSJr KVarni \,v>ri V L>r*V 1 i ^ L>N V UAJj/vtAVKVM r.Hi-Aj X v*t 

yvls aamogdvxsmac fygllahvartrlt p s mag as v e e d ac*l 
>:ei^aeu?3pdloleealetmawgrgpvcllaggeptvologsg 
?-ggrkoelalrvgaelrrmplgpidvlflsggtdgodgpteaag 

A.WVTPELASQAAAEGLDI ATFLAHNDSHT PFCO^QGGAHELKTG 
KTG77JVMDTHLLFLR PR 


6942 


3 


24 6 


C-DyVERyDPKTDTWTMGAPLSKPTNAVGGCLLGDRLYADGGyDG 
CTYLNTMESYDPOTNEWT0MASLN2 GRAGACVWI KOF 




1 


73 9 


FMATGDGAKTLA1 HVKALTADS 3 R 1 TWKATLP AS S FR LSKLRUG 
HS PAGGS1TE7LVQGDKTEYLLTALEPKPTY 1 1 CMVTMETTNAY 
V ADE T P V CA KAETADS Y G PTTTLN QEQNAG PMAS L PLAG 1 1 GGA 
VALV F LFLVLGAI CWYVHQAGELLTRERAYNRGS R KKDDY KESG 
T KKDWS I LE I RG PGLQMLP I N P YRAKEEYVVHT I FPSKCSSLCK 
ATKT 1 G YGTTRG YUDGG I P D I DYS YT 


6944 


960 


156 


w-n i li ,ngv ky es eltgs £ eraeqpls vgrlcst i cnm p kalrt 
lc\w:fix3wlsfex^llfytdfwgewf^dpxapht£eayoky 
ksgvtmgcwgmciyafsaafysailekleeflsvrtlyf:ayla 

r*G LG TGLATLS RNLYWLS LCI T YG I LFSTLCTLP YS LLCD Y YQ 
S K K F AGSSAJX3TS RGMG VD I SLLS CQ YFLAQ 1 LVSLVI. G P LTS A 
VGfAKGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADEEHRPLL 


6945 


2067 


275 


f EGEDRGLPR'J'«GAALG7GTKLAPWPGRACGAlyPRWTPTAPAC)GC 
KSKPGPAKPVPLKKRGYDVTRNPHLNKGMAFTLEERLQLGlHGb 
1 PPCF1 SQDVOL»LRlMHVYERCQSDLDJCyi 1 LMTLODRNEKLFY 
RVLTSDVEKFMPIVYTPTVGI.AC0HYGLTFRRPRGLF1TIHDKG 
HLATMLNSWPEDNI KAWVTDGERI LGLGDLGCYGMGI PVGKLA 
LYTACGGVN POQCLPV LLDVGTNNEELLRDPLY 1 GLKHQRVHGK 
AYDDLLDEFMOAVTDKFGINCLJQFEDFAKAWAFRLLWKYRNKY 
CM FN DD 1 QGTASVAVAC- 2 L AALR I TKN KLSNHV FG FQG AGE AAM 
G \ JAHLLVMALB\ KEG VPKA \ EATRKI W\MVPF \ KGLI VQGRDH 
bNHEKE.MFAOD\HPEVNSLE£VVRLVKPTAllGVAATAEA\FTE 
C2LRD^SFHERP\I 1 FALSNPTSKAECTA\EKCYRVTEGPRGF 
FAS\G t JPF*GVLIWEMGXTFIPGGRGNNA*RVPRGWQLGVHSPG 
GDPGH I P\DE 1 FLPDSRAKIjPQEVSEQHLSCGRLY P\PLST\1R 
NVFLR I A I KV FD * G YKRNLV \S YY PEPKD \ KEAFCK I PGS YTPD 
YDSFYT/VDSY1WACGXAMNVQTV 


6946 


13?. 


2551 


SCEYSGJTVAPGDPCPGVAHLIiAPSMASDTPESLKALCTDFCLR 
NLDGTLGYLLnKSrLRLHPDIFLPSElXCDRbVNEyVElyVNAAC 
NF\EPHE\SFFNPLFRDPRKUPASRRIHL\RED\LVQD\QD\LE 
AIRKODLAVEL\YLTN\CEKLSAKSL0TL,RSFSHTLGVP*AFPG 
C\TNILLLRKEMPGGL/CEDEYLFNPTCQVLVKDFTFEGFSRbR 
F\ LKLG RMI DWVP VES \ LbRPLNSLAALDLSG 1 QTS DAA\ FLTQ 
WKDSL\VSLVL\YNMDLfDDlUR\VIVOLHKLRHbDISRC>RbSS 
YYKFKLTREVLSLFVQKLGr^MSLDISG\HKlLENCSISKlGKR 
KAGQTS1 \EPSK\SS J 1 PFRGFEGGPLQF\LGVF*GI FCGRLTH 
I FA YKVSGDKNEEOVLNAI E A Y TEHR PE I TS RA I N LL FD JAR I E 
RCTgOLLRALXLVITALKCHKYDRNlQVTGSAALFYbTNSEYRSE 
OS VKLK RQV1 Q WENGMES YQE VTVQRNCCLTLCNFS I PEELEF 
OYRRVNEbLLS 1 LNPTRODES 10R1AVHLCNALVC0VDNDHKSA 
VGKWGFWrMLKLTOKKLLDKTCDQVMEFSWVSAbWNlTDETPD 
NCEMFLNFNGMKbFLDCLNEFPEKOEbHRN^bGLbGNVAEVKEL 
RPQLMTSOFISVFSNLLESKAD3IEVSYNACGVLSHIMFDGPEA 
WGVCEPGREEVEER.MWAAlOSWDlNSRRNlNYRSFEPILRLbPQ 
G 1SPVSCHWATWALYNLVSVY PDKYCPLL1 KLGGMPLLRDI I KM 
A TARO E T KEMAR KV I EH CS N FXEENMDTS F 


6947 


2 


1682 


TSVSTIPRGLASARPOSFKSWRCCPVWRRSPGRARGRGl.KMbNVP 
SQSFPAPRSQQRVASGGRSKVPLKOGRSLMDWJRLTKSGKDLTG 
bKGRLlEVTEEELKKHNKkDDCrflCIRGFVYNVSPYMEYHPGGE 
DELMRAA^SCGTELFDOVHRWVNYESMLKECLVGRMAIKPAVLK 
DYREEEX XVLNGMbPKSOVTDTLAKEGPSY FSYDWFQTDSLVTI 
/ EH I Y * TEG YQ FR LNNS * S S E * FL YS RNNY • GbbI S YTYW/R * A 
KRFRXI FbCGL/CESVGKlEl VbQKKENTSWDFLGHPLKNHNSL 
IPRKDTGLYYRKCQLISKEDVrHDTRbFCLMLPPSTHLQVPIGQ 
HVYLKLPITGTEIVKPYTPVSGSLLSEFKEPVLPNNKYIYFLIX 
lYPTGLFTPELDRbOIGDFVSVSSPEGNFKlSKFQEbEDLFLIiA 
AGTGFTrMVKILNYALTDl PSLRKVKLMFFNKTEDDJI WRSOLE 
KLAFKDKR LDVEFVbSAP 1 SEWNGKQGH1 SPALLS EFbKRNLDK 
SKVLVCJ CGPVPFTEOGVRLLHDLNFSKNEIHSFTA 


6948 


104 


58 


PDGA^SFFPDEYFTCSSLCLSCGVGCKKSMlWGkEGVPFlEAKSR 
CRYSH0YDNRVYTCKACYERGEEVSWPKTSASTDSPWI4GLAKY 
AWSGYV1 E CPNCGVVYRSRQYWFGNQDPVDTWRTEI VHVWPGT 
DGF^KPN^AAQRLl^^TMAQSVSELSLGPTKAVTSWLTDQI 
APAYWRPNSOIbSCNKCATSFKDNDTKHHCRACGEGFCDSCSSK 
T3 PVPER GWGPAPVRVCDK CY E AR/TRPVSC Y RGTSGR * RRRRT 
OETVE 


6949 


152 


4656 


GLRLCLSK PLTRPGDDS VGGSAMASGAGGVGGGGGGK 1 RTRRCH 
QGP I KP YQOGRQQHQGI LS RVTESVKNI VPGW LQRYFNXNEDVC 








SCSTDTi EVPRWFENXEDHLVYATEESSKJTDGRITPEPAVSNT 
EEPSTTi'TASTWPDVLTRVSLYKSKLNKSMI.ESPALHCQPSTS 
SAFPlGS'SGFSLVKElKOST£Q«rDDNJ£TTSGFSSRASDKDIT 
VSKNTSLPPLWSPEAEUCHSLSOHTATSSKXPArNLSAFGTLSP 
SLGNSS3 LXTSQLGDSPFYPGXTI YGGAAAAVRQSXLRNTPYQA 
PVR K QMKAKQLSAQS YG VTS STAF K I LOS LEKMSSPUkDAKR I P 
£1 VSSP1 :KS PLDRSG 1 D I TD FQAK E EK VDSQ Y PPVQRLMTPX PV 
SI ATNRSVYFKPSLTPSGEFRKTNORIDKKCSTGYEKNMTPGQN 
KEORESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 
LEEEEKFG PVLPKI SLF I TSSSLFTFNFSSPE I TTSS PS PI NSS 
OALTNKVGKTSPSSTGSPMrKFSSriVKSTFANVLPPSSlGFTF 
SVPVAKTAEI^GSSSTLEPIISSSJ^HVTTv'NSTHCKKTPPEDC 
EG PFRPAE3 LXEGSVLD3 LKSPGFAS PX1 I). c VAAQPTATS PWY 
7RPAISEFSSSG1GFGESLKAGS5W0CDTCI.L0NKVTDNXCIAC 
OAAKLSFKDTAKOTGIETPNXSGKTTLSASGTGFGDKFKPVIGT 
WDCDTCLVCNXPEA1XCVACETPKPGTCVKRALTLTWSESAET 
MTASSSSCTVTTGTLGFGDKFKKFJGSWECSVCCVSNNAEDNKC 
VSCMSEKPGSSVPTSSSSTVPVS1! SGGSI^LEKFKKPEGIWDC 
E LCLVQN KADSTXCLACES AX PGTKSG FXG FDTS S S S S NS AAS S 
SFXFGVf F?SSGPSQTLT£TGNFKFGDQGGFX3GVSSDSGYINP 
MSEGF* FF.XHIVGFKFGVSSESXPF EVXXDGKNDNFKFGLSFGl* 
SNPVKl.lPFOFGVSNIXOEEKXEF.bljXSSC/iGFRFGTGVINSTR 
VPAHTIV'l-.'iCNXSSFNLGTIETKSVSVAPLKCOTSEAKKEEMPA 
TKGGFSFGNVKPASLPSASVFVLGKTEEXOCEPVTSTSLVFGEG 
XLTMXEPKC\OPVFSFGEFORQTXTJENSSKSTFSFSMTXPSEKE 
SEOPAXA7TAFGA0TNTTAD0GAAKPDLSYLNNSSSSSSTPATS 
AGGG\ 1 FGCSTSSSHPPVATFVFGC^ SNPGSSS\AFGNTAESST 
SOSLLFSODSKLATTSSTGTAVTPn/FGPGASSNNTTTSGFGFG 
ATrrSSSAGSSFVFGTGPSAPSASFAFGANCTPTFGOSQGASQP 
NPPGFGS1 ;:SSTALFPTGSQPAPP:TGTVS£SSOPPVFGOQPSO 
SAFX5SGTTPNSSSAF0FGSSTTNFNFTNNSPSGVFTFGANSSTP 
AASAQPSG SGGF P FNQS PAAFTVGS NGKNV FS SSGTS FSGR K I X 
TAVRRRK 


6950 


2585 


4: -j 


PRPGSRSGliCH RAGERGAVRAGGLS n RTRAE * 3 MDELH YQDTDS 
DVPEORDAXCXVXWTHEEDEOLRALVROFGOODWKFLASHFPNR 
TDQQCQYk WLRVLKPDLVXGPWTXc CQQKV7 ELVXKYG TKQ WTL 
I AKHLXGK : .GKOCRERWHNKLNPE VK KSCWTEEEDR 1 1 CEAHX V 
LCNRWAE 1 AXMLPGR TDNA V KNKKNS TI XR X VDTGG FLSESKDC 
XPPVYLLLELEDKDGLQSAOPTEGOGSLLTNWPSVPPTIKEEEN 
S EEELAAATTSKEQE P I GTDLDAYKTPEPLEE FP XREDQEGS P P 
ETSbPYKWWEAANLLI PAVGSSLSEALDL I ESDPDAWCDLSKF 
DL?EEPSAEDSJNNSLVOLOASHQOOVLPPROPSA\LVPSVTEY 
RLIX>HT3SDLSRSSRGELI F3 SP5TEVGGSGI GTPPSVL»KRQRX 
R R VALSPV TENSTSIS FLDS CNS LTr KSTF VXTLPFSPSQFLNF 
VWKQDTLE LESPSLTSTPVCSQXVVVTTPLHR DXTPLHOXHAAF 
VTPDQKYSKDNTPHTPTPFX^ALEKyGPIiXPLPQTPKLEEDLXE 
VLRSEAG3ELIIEDDIRP5XQXRXPGLRRSP1XXVRKSLALD1V 
DEDMKLMy.STLPKSLSLPTTAPSNSSSLTlSGIXEDNSLLNQGF 
LOAXPEXAAVA0KPRSHFTTPAPMSSAWXTVACGGTRDQLFMOE 
XAROLUSR 1 XPSHTSRTL I LS 


6951 


i94 0 


23<r 


AG PDDTttKK SLQAL YCQLLS FLL 3 1A LTEALA FA IQEPS PRESL 
OVLPSGTPFGTMVTAPHSSTRHTSW^MLTPNFIXSPPSOAAAPMA 
TPTPRAEGKPPT\TP$PPSLRC* PPPI LXAF/SSTGPAPAAMAT 
TSSXPEGR PRGQAAPT I LLTXPPGATS RPTTA P PRTTTRRPPRP 
PGSSRXGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPIjGQXRP 
LGXI FOI Y KG WFTGSVEPEPSTLT P RTPLWG Y S SS PQPQTVAAT 


— 






TVPSNTSWAPrTTSLGPAKDKPGLRRAAOGGGSTFTSQGGTPDA 
TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LLAYCYP\CT 
S RPLSTSSGVFTAATGPTPAAFDTS VSAPSQGI PQGASTTPQAP 
THPSRVSESTISGAKEETVA\PSP*PTGCPVLSPOVJYPQpoAIS 
STAWSPPGPGSLG0OGTSPMWPRGTNRSTEPPSA*ARV^ISFG*S 
WP S ACPSP P\ LC P ADG VI J3EEEEEDRQPGEQPEA YGNNTKK PGT 
TFCK?AC\RGAAPGEJPVPLKPLRTQLSEPRSPANGDYRDTGMVP 
C 


6952 


658 


304 


PESEGESGEKiTDRYTJHSQLEHl^SKYlGTNATPTPPSGSGNCE 
PTPRLVLLl»HGPLRPSQLLRHCGE*EQSASPIiOLDGKDASALWT 
ASRQARGELRLCLTTAVRGTSPSVSPVCOSS 


6953 


1515 


34 5 


NWGKTRAJLASGKHVPFGKQTNPNKS/VUCDS *G* * RRETT^DES 
FS PH FRGKMGGW \ KLEXELENTEQPVGGNEG * E HE VTGNLNS D 

GRALSSPGSLGRHLLIHSEDCRSNCAVCGARFTSHATFNSEKLP 
EVLNMESLPTVHNEGPSSAEGKD1AFSPPVYPAG1LLVCNNCAA 
YRKLLEAQTPSVRKWALRRONEPIjEVRLORLERERTAKKSRRDN 
ETPEEREVRRMRDREAXRLORKOETDEORARRLORDREAMRbKR 
AIETPEKRQARiaREREAKRLKRRLEKMDMMLRAQFGQDPSAMA 
AIAAEMNFFQIjPVSGVELDSQLLGKMAFEEONSSSLH 


6954 


819 


1 


PPPPPI I PSHPREAGT*AG* KRSGDSECSPPVEQ*A* TRAAAUN 

* PQR* RWTEGNSPQA5AVATPGCX3ASPAAPRCTP* PSRRHRRLP 
PGAJRPPAG* AAPAPTKPWLAGPASAPOPGAAPLSPPAPPLIRTR 

* t— * /■> * »k »i r>r»« , » nnnr\DCnn r>DTfv l r'(^ciiit , n'nDTT)T)R\fC 7\ c* Is ftfpnc 

* CAGAAANbKrKKUK.brKFKI i J OljL^Wi>c.* J K J FrAvb/ioAVi J'i 

UAG*AGGR*GQRQRPSTGR* PPGVGGAGRSHRREGT3 PGNPHPR 

as*ragwqr*pgp/rewgl*epcgeemsgpggpggaffnqvgss 
vmqamstgi , 


6955 


1966 


782 


P PGR ROVRAQVAGAPVGHWGTKARQVXTGGRRRAR RTMPF2/GQD 
WRSPGWSW1KTEDGWKRCESCSQKLERENNHCN3SHSI1LNSED 
GE I FNN EEHE YAS KKR KKDHFRNDTNTOS FYR EK WI y VHK BSTK 
ERHGYCTLGEAFN R LDFSSAI QDI RR FNYWKLLQL I A KSQLTS 
LSGVAOKNYFNlLDKIVOKVLDDHHNpRLIKDLLQDLSSTIiCIL 
/N *RSREVCI SGKRQYLDLP1 RNYSRLATTATGSSDD* ASE\NG 
LTLSDLPLHMLNN I LYR FSDGWDI I TLGQVTPTLYMLS EDRQLW 
KKLCOYHFAEK0FCRHL1LSEKGH1EWKLMYFAL0KHYPAKEOY 
GDTLHFCRHCS I LFNKDSGHPCTAADPDSC FTP VS POH FI DLFK 
F 


6956 


8605 


3839 


QTSTS1 FASPTSPPVLGESVI>QDNSFDl>NNGSDAEOEEMETCSS | 

DFPPSLTQPAPDQSSTIOLHPArSPAVSPTTSPAVSLWSPAAS 

PEISPEVCPAASTWSPAVFSWSPASSAVLPAVSLEVPLTASV 

TSPKASPVTSPAAAFPTASPANKDVSSFLETTADVEEITGEGLT 

ASGSGDVMRRR1ATPEEVRLPL0HGWRREVRIKKGSHRWQGETV? 

YYGPCGKRMKQFPEVIKYLSRNVVHSVRREHrSFSPRMPVGDFF 

E E RD TPEG K?W VQLS AEE I PS R 1 0 AI TGKRGR PRNTE KARTKE V 

PKVKRGRGRPPKVKITELLNKTDNRPLKKLEAQETLNEEDKAKI 

A K S K KKMR QKVORGECQTTI QGQARNKR XQE TKSLKQKEAKXXS 

yj^EKEKGKTKOEKLKEKVWEXXEKVKMXEKEEVTKAKPACKAI) 

KTLATORRLEERQRQOMILEEMKKPTEDMCLTDHQPLPDFSRVP 

GLTLPSGAFSDCLTIVEFLHSFGKVLGFDPAXDVPSLGVIjOEGL. 

LCQGDSLGEV0DLLVRLLKAALKDPGFPSYC0SLKILGEKVSE1 

PLTRDNVSE I LR CFLMA YG VEPAI/CDRLR TQPFQAQP PQQ KAA V 

LA FLVHELNGSTU 1 HE I DKTLESMSSYRKNKW I VEGRL.RRLKT 

VLAKRTGRSEVEMEGPEECLGRRRSSRIKEVTSGMEEEEEEES3 

AAV PGRRGRRDGEVDATASS I PELERQI EKLS KRQLFFR KKLLK 

SSgMLRAVSLGODRYRRRYWVLPYLAGIFVEGTEGNLVPEEVIK 

KETDSLKVAAHASIJ^PAl>FSMKMElJ\GSNTrASSPARARGRPRK 
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SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)
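
The legend above can be read as a small symbol table. The following is a minimal sketch, not part of the original specification, of how a segment from this table might be checked against that legend; the names `LEGEND` and `summarize_segment` are hypothetical and chosen only for illustration.

```python
# One-letter codes and special symbols exactly as listed in the table legend.
# LEGEND and summarize_segment are illustrative names, not from the source.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Count legend symbols in a segment and collect characters outside the legend."""
    counts = {}
    unexpected = []
    for ch in segment:
        if ch.isspace():
            continue  # whitespace is treated as layout, not sequence
        if ch in LEGEND:
            counts[ch] = counts.get(ch, 0) + 1
        else:
            unexpected.append(ch)
    return counts, unexpected


if __name__ == "__main__":
    counts, unexpected = summarize_segment("MK*X/\\ A1")
    print(counts)
    print(unexpected)
```

Under this assumption, any lowercase letter, digit, or punctuation mark other than `*`, `/`, and `\` is reported as unexpected, which may help flag transcription damage in a segment without guessing at the intended residue.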








TK PGSMQPR HLKS P VRGQDSEQPQACLQPEAQLHA P AQPQFOLQ 
LOLOSHKGFLEOEGSPbSLGOSOHDI.SOSAFLSWLSOTQSHSSL 
IjSSSVLTPDSSPGKLDPAPSQPPEEPEPDEAESSPDPQALWFNI 
SAQMPCNAAPTPPPAVSEDQPTPSP001ASSKPMNRPSAANPCS 
PVQFSSTPLAGLAPKRRAGDPGEMPOSPTGLGOPKKRGRPPSKF 
FXOMEQRVLTObTAQPVPPKMCSGWWWlRDPEMLCAMl.XALKPR 
GlREKALHKWLNKHRDFLOEVCLRPSADPlFEPRCLPAFOEGlfi 
SVJSPKEKTVETD1AVL0WVEELE0RV1MSDLQIRGWTCPSPDST 
REDLAYCEHLSDSOEDITVJRGRGREGLAPQRKTTKPLDLAVKRL 
TJU^EONVERRYl.REl'LWPTHEWL'tiKALLSTPNGAPEGTTTEIS 
Y El TPR I R WROTLEPXRSAAQVCLCLGQLERSI AWEKSVNKVT 
CLVCRKGDNDL\-LLLCDGCDRGCHIYCKRPKMEAVPEGDWFCTV 
CLAQQVEGEFTCKFGFPKRGOKRKSGVSLNFSEGDC-RRRRVLLR 
GRESPAAGPRYSL'EGLSP^KRRRLSMRNHHSDLTFCEIILMEME 
SHDAAWPFLEPVNPRLVSGYRRIIKNPKDFSTMRERLLRGGyTS 
SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRFKE\SRWEEF 
YCGKOGCSVROGRWGVTI J WHLPPTK0TKTCHFHLLKl>PWV0T0V 
RYNPDF 


6957


82 


3bJ4 


KLJVAMPEPTXKEENEVPAPAPPPEEPSKEKEAGTTPAKDWTLV 
ET P PGEEOAKQN AN SOLS I LF 1 EKPOGGTVK VGED ■ T F I A KVKA 
EDLS EKPTI NGf. R KWMDLAS KAGKH UQL KETF ERK5RVYTFEKQ 
1 1 KAKDNFAGNYRCEVTYKDKFDSCSFDLEVHESTGTTPNID1 R 
S A FKRS G EG 0 ED AG ELD F S GLLKRR E VK 00EE E PQ V DV WE LLK N 
TKPSEYEK3 AFOYESPTCSGMJjKRLKRSIREEKKSAAFAKILDP 
VYCVDKGGRVRFWEliADPKLEVKWNKNGOELRPSTKYIFEDTR 
COSlLNIDNCOMTDDSEYYVTAGDEKCSTELbVREPP^MVTKCL 
EDTTDYCGERVELECEVSEDDAOVKWFKNGEEIILVOTRYRIRV 
EGKKHI LI 1 EG7vTKADAADY£VMTTGG0SSAKLSVDLKPLKI LT 
PLTDQTVWI^KEICLKCEISENIPGKWTKNGLPVOESDRLKWH 
KGRIHKLVIDHALTEDEGDYWAJPDAYWTLPAXVHV2DPPK1I 
LDGLDADN-rVTVIAGNKLRLElPISGFPPPKAMWSRGDKAIMEG 
SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 
VKWDFPDPPVAPTVTh:VGDDWCIMNWEPPAYDGG£P3I>GYFIE 
KKKKQSSRWMRLWFDLCKETTFEPKiaoi EGVAYEVR 1 FAVNA\ I 
GISKPSMPSRPFVP1JVVTSPPTLLTVDSVTDTTVTMRWRPPDHI 
GAAGLDGYVLEYCFEGSTSAKOSDENGEAAYDLPAEDWIVANKD 
LI DKTKFTI TGLPTDAKI FVRVKAVNAAGASEPKYYSQPl LVKE 
1 1 EPPK1HSPKKLKOTY1 RRVGDRV1 LVI PFOGKPR PELTWXKD 
GAEIDKNOl N3 RNSETDTI 1 Fl RKAERSHSGKYDLOVKVDKFVE 
TAS1 Dl R 1 3 DRPG P POl VK I EDVWGRNVALTVTTPP KDDGNAAI T 
GYTlOKADKKSKF.WbRVIEHIlEPVFHTELVIGNEYYFRVFSEN 
MCGLSEDATMTKE S AVI ARDGKI Y KNP VYEDFDFSEAPKFTQPL 
VNRLCHSGYMATLNCSVRGNPKPKITWMKNKVAIVDDPRYRMFS 
NQGVCTLE I R KPS P YDGGTYCCKAVNDLGTVE I ECKLEVKV I AQ 


6958


274 


1663 


PRTSRVKTEGSOGSSAMDFSVKVDIEKEVTCPICLELLTEPLSL 
DCGHSFCOAC1TAKIKESVIJSRGESSCPVCOTRF0PGNLRPNR 
HLANI VERVKEVKMSP0EG0KRH VCEKJiGKKLQ I FCK EDGKVI C 
^CELSOEHOGKOTFRINEVVKECOEKLQVALQRLIKENOEAEK 
LEDDI RQERTA WKN Y I Q I ERQK I LKG FN EMR V I LDNEEQR ELOK 
LEEGEVNVLDNLAAATDOLVOQRODASTLISDbORRLRGSSVEM 
L0DVIDVMKRSESVJTLKKPKSVSKKLKSVFRVPDLSGMLOVLKE 
LTDVQYY WVDVMLN PGSATSNVAIS VDOROVKTVRTCT FKNSNP 
CDFSAFGVFGC0YFSSGKYYV9EVDVSGK I ANILGVHS XI SSLNK 
RXS SGFAFDP SWY S KVYSR YRPQYGYWV I GLQNTCE YNAFEDS 
SSSDPKVLTLFMAV\LPWLGFS 




a 


146$ 


SLVHWEFGRG2 EDFPYLFFOLTHCOOR ICSVTQAGVQWCDHSS 
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








LOPQTPGLNQSSHl.SULSSfUDYRML^SFNEWFWODRFWLPPNX'']- 
WTELEDPDGRVY PH PODLLAALPLALVblAMRLAF ERFI GLPLS 
RWLGVRDOTRROVKPNATLEKHFLTEGHRPKEP&LSbLAAOCGL 
TbQQTOR W FRRRRNQDR PQLTKKFCEASWRFLF Y l S S FVGCLS V 
LY HESViLM APVMCWDR Y PNQLTL>SCPAADSEA\ S 1 YWWYULELG 
FYLSLL1RLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKE0V1H 
H F VA V 1 LM TPS Y£ ANL I ->R 1 G SLVLLhHDS S VYLL E A C KM VNYMC 
YQQVCDALFM FSFVFFYTRLVLFPTOl LYTTYYES I SNRGPFF 
GYYFFTgGLLMIiljCLLHVFWSCLILRMLYSFMKKGtiKEKDlRSDV 
EESDSSEEAAAAQEPLOLKNGTAGGPRPAPTDGPKSRVAGRLTK 
KHTTA" 


6960


387 


2068 


AKWAREKKMOEF\TRSFF\RGRPDLSTLTHSI VRFvRYLAHSGKS 
.HL EPEEK QA LKR1»V£ E E PLKMQVDEhftSR EDK LDITKKGKRPP7 
PCSDPERKRFRFNSESESGSEASSPDYFGPPAKNGVASRSHTKF 
KEENPRRA\SKAVEESSDEERQRDLPAORGEESSEEEEKGYKGK 
TRKKPWKKOAPGKASVSRKQAREESEESEAEPVC^TAKKVEGN. 
KGTKSLKESEQESEEEI LAQKKEOREEEVEEEEKF EDEEKGDWK 
PRTRSNGRRKSAREERSCKQKSQAXRHiGDSDSEE EOKEAASSG 
DDSGRDR E P P VQRK S ED RTQL.KGG KRL SGS SEDELDSGKGE PT A 
KGSRKMARLGSTSGEESDLEREVSDSEAGGGPOoF KKNRSSKKS 
5RKGRTRSSSSSSDGCPEAKGGKAGSGRRGEDHFA\^IRLKRY1R 
ACGAJIRNYKKLLGSCCSHXERbSILRAELEALGWKGTPSLGKCR 
ALKEQREEAAEVASLDVANI 1 SGSGRPRRRTAKXFLGEAAPPGE 
^RRTbDSDEERPRPAPPDWSHMRGIlSSDGESN 1 


6961 


340 


1646 


RPWSSPTMKPNFSLR1/R 3 FNLKCWG1 PY1>SKKRADKMRRLGDFL 

kqesfdlalleevwseodfoylroklsptypaa:-?^frsgi 1GSG 

LCVFSKHP2 0ELTQK1 YTLNGYPYMJHHGDWFSGKAVGLbVLHL 
SGWVLNAYVTHLHAE Y N RQKDI YLAHR VAQAWELAC FI HHTS KK 
ADVVLLCGD^NMHPEDLGCCLLKTWTGLHDAYLETKDFKGSEEG 
NTMVPKNCYVSQOELKP FPFG VR I D YVLYKAVSG FY 1 SCKSFET 
TTGFDPKRGTPLSDHEALMATLFVRHSPPOONPSSTHGP\AERS 
PL/MCVGLKEALDGSLGLGMA\OAR WWA\TFA \S V V i GhGL\ hL 
LALLCVLAAGGGAGEAA1 LLWTPS VGLVLWAGAFY LFHVQEVNG 
LYRAOAELOHVLGRAREAODLGPEPOLYALL\LGGv'EGDRTKEG 


6962 


346 


1646 


RPWSSPTMKPNFSLRXR1 FNLNCWG1PYTjSKHRAI?RWRRIjGDFL 
NQESFDLALLEEVWSEODFOYLROKLSPTYPAAHHrTJSGIlGSG 
LCVFS Ki i P I QE LTQH 1 YTLNG YP YMIHHGDWFSG K^. VGLLVJLHL 
SGMVIjNAYVTH LHAE YNRQKD I YLAH R VAQA WE LAGF 1 HHTS KX 
ADWULCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMV P K NC Y VSQOELK F FP FG VR ID Y VLYKAVSG r V I SCKS FET 
TTGFDPHRGTPLSDHE/aKATLFVRH S PPQQNPS S THGP \ AERS 
PWMCVCLKEALIX5S1X?U;MA\QARWWA\TFA\ S YVIGLGlA UL 
LALbCVlJAGGGAGEAAILLWTPSVGLVLWAGAFYLFHVOEVNG 
LYRAOAEL0KVLGRAKEAQT>LGPEPOl'YALL\LGOGEGDRTKEO 


6963 


374 


2618 


RVTPL 1 L.KLLKKPKTAENQKASEENE I TOPGGS S A < PGL PCLNF 
EAVl>SPDPAl.lHS'rHSLTNSHAHTGSSDCDISCKGVlTERlMSIN 
LHNFSNSVLETUJEQRNRGHFCDVTVRIHGSMLRA.ORCVLAAGS 
PFFODXLLLGYSDIEIPSWSVOSVOKLIDFMYSG\T.RVSC)SEA 
LOILTAASILOIKTVIDECTRIVSONVGDVFPGIODSGODTPRG 
TPESGTSGOSSDTESGYLQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYSGAWSHHE?ALGLPRDHHMEDPSWITRJKEKSOO«BRy^ 
STTPE7THCRKQPRPVRIQTLVGNIH1XQEMEDDYDYYGQQRV0 
ILERNESEECTEDTDOAEGTESEPKGESFDSGVSSSIGTEPPSV 
HTQQFG PG AARDSQAEPTQPE0AAEAPAEGGPOTO01 ETGASS PE 
RSNEVEKDSrVITVSNSSDKSVLCXJPSVNTSIGCPLPSTOLYLR 
QTETUTS NLRMFLTLTS NTQV IGTAGNTYI.PALFTTOPAGSGPK 






WO 01/53312 



PCT/US00/34263











PFLFSLPCrLAGQOTOFVTVSOPGLSTFTAOLPAPOFLASSAGH 
STASG0GEKKPYECTLCKKTFTAK0NYVKHMFVHTGEKPKOCSI 
CWRS FSLXHYL! K\HMVTHTGVRAYQCS I CNKR FT0KSSLNTVHM 
RLKRGEKSYECYICKKKFSHKTLLERHV/ALHSASNGTPPAGTPP 
GARAG PPGVVACTEGTTY VCS VCPAKFEQI EQFNDHMRMH VSDG 


6964 


3 


178


SGRFFFFFFSN7DVYF3KKVTNRWTAGSSYKM7RMKSIGKILLL 
Q1FIG\NCSMFVLVI 


6965 


75 V 


208 


NVF3EPK1CGKMKTSAHPG0KHPDFSMGLLFPLLAALEVCSCGS 
SGSLGYNLPCNH\GLI J GKNTLVLLG0MRR1SPFLCLKDR^DFRF 
P0EKVEVSCL0KA\OAMSFLYDVLQOVFNFSHKALL\CCMEHDL> 
PGPTFHFTSSAAGTPGDLLGAGDGRRRSWGOWVIEGSTLALRRY 
F0ES2STLE 


6966 


820


1867 


IlTALGVRGMPGCPCVGCGl-VvGPRLl,FLTAl>Al J ELLGRAGGSOP 
ALRSRGTATACRLDNKESESWGALLSGERLDTWICSLLGSLWVG 
LSGVFPLLVjPLEMGTMLRSEAGAWRLKOLLSFAI^GLLGNVFL. 
HLLPEAWAYTCS AS PCGFGQSLQQQQQLGLWV 1 AG 1 1 .TFLlALEK 

/hvpgoogggdopgpoorphcccrraqwrplsg pagcrar prcr 
gp\d: KVSGYLNLLANTI dnfthglavaasflvskkigllttma 

ILL1IE1 PHEVGDFA1 LLRAG FDRVJ S AA KLQLST ALGGLLG AG FA 
J CTOS ? KG V EETAAWVL P FTSGGFLY I ALVNVL P DLLE EEDPW 


6967 


161 


633 


GPLPFKywiLDl^ASSRMETDCNPKEL.SSMSGFEEGSEl.NGFEG 
TDMKUMR LE AEAWNDV LK AVNNMF VS KSLRC ADDV AY 1 NVETK 
ERNRYCLELTEAGLKWGYAFDQVDDHI^TPYHETVYSLLDTL\ 
SPAYREAFGKR \LLQRLEALKRDGQS 


6968


3 


226S 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGLQXT 
LEO FHLSSKSS LGGPAAFG AR WAQEAY KKESAK E AGAAAV PAP V 
PAATEPPPVLHLPAIOPPPPVLPGPFFMPSDRSTERCETVLEGE 
Tl SCFWGGE KELCbPOl LNS VLRDFSLQQ1 N AVCDEUi 1 Y CSR 
CTADCLEJLKVMGILPFSAPSCGLlTKTDAERl.CNAliLYGGAyp 

ppckkelaaslalglfi,sersvrvyhe\cfgkckgl\lvpe:>ys 
spsaaciqcldncrlkypphkfwksmkalenrtchwgfndsan 
nwrayillsodytgkeeoarlgr\clddvkekfdygnkykrj?vp 
rvsseppas1rpktddtssqspapsexdkpsswlrtlagssnks 
lgcvhprqrbsafrpws pavsasexelsphlpali rdsfysyks 

FETA VAPNVALAP PAQQKWS SPPCAAAVSRAP E PLATC7C/PRX 
RKLTVDTPGA PETLAP VAA PEEDKDS5AEVEVES REEFTS S LSS 
LSSPSFTSSSSAKDUJSPGARALPSAVPDAAAPADAPSGLEAEIj 
EHLR0ALEGGLDTKF7\KEKFLHEVVKMRVK0BEKLSAALC)AKRS 
LK0ELEFLRV7iKKEXLREATEAKRNLRKEIERXRAE^KKMK£A 
NESR LRLXR ELEQARQAR V CDKGCEAGR1 >RAKY SAQ 1 EDLQV Kb 
QHAEAORJiOLFJuOLLREREAREHLEKXvVKNEbQEOLWPRARPE 
AAGSEG\AAEliEP 


6969 


1855 


118


AGTMHGRLKVKTSEEQAEAKRLEREOKLKLYQSATOAVFQKR0A 

AALVKAELGFLESCLRVNPXSYGTWHHRCWLLGRLPEPNWTREL 
EbCAJRFLEVDERNFHCWDY R RFVATQAAVPPA£ELAFTDSL I TR 
NFSNYS S WHY R S CLLPQLK P Q PDSG PQGRbPED VLLKEL E L VQN 
AFFTD?Nt>OSAWFYHRWLLGRADPQDALRCLHVSRDEACLTVSF 
SRPLLVGSRME I LLLMVDDS PL.I VEWRTPDGRNRPSHVWLCDLP 
AASLNDQLPOKTF RVI VTTAGDVQKECVLLKGRQEGWCRDS TTDE 
OLFR CELSVEKSTVL0SELESCKELQELEPENKWCL\LTI 1 LLM 
RALDPLLYEKETLQYFOTLK\AWDPXRATY\LDDLRSXFLLENS 
VX^IEYAEWVLHlAHKDLTVl^LEQLLiVTHbDLSHNRLRTL 

ppalaalrcledppprtWloasdnaiesldgvtnlprloelll 
cnnrb00pavlqplascprlvx,lnlqgnplc0avgileolaell 
psvssvlt 





6970




1528 


PPLLSSPSAVGEGKVAVAAPCPGRSECARAXMAYIQLEPLNE 
GFLSRJSGLLLCRWTCRHCCOKCyESSCCOSSHDEVElLGPFPA 
t>7 P P WLKASK $ SDKDGDS VHTASE V P LT F RTN S PDGRRS S S DTS 
KFTYSLTRRJSSLESRRPSSPUDIKPIEFGVLSAKKEPIQPSV 
LPKTYNPDDVFRKFEPHLYSLDSNSLWDSLTDEEILSKYQIiGM 
UIFSTQYDlaLHKMLTVRVlEARDLPPPISHDGSRQDMAHSNPYV 
K: CLLPDQK^SKQTGVKRKTQKPVFEERYTFEI PFLEAQRRTLL 
LT WD FD KFS KHCV I G KVS VPLCEVS LV KGGHWW KALI PS SQNE 
VTLGELLI>SLNyLPSAGRlJWDVIRAKOLLOTDVSOGSDPFVKI 
CIVHGLKLVKTXKTSFLRGTIDPFYNESFSFKVPQEELENASLV 
rTVFGHMMKSSNDFIGRIVIG\OYSSGP\SEPNHWRRMLNTHRT 
A V E Q WHS LR S RAEC DR VS PASLEVT 


6971 


37


3702 


ACrrVPGSRSFKLlPRHGLVNKGRSGKLPSGVSAKLKRWKKGHS 
SDSNPAl CRKRQAARS R FFSRPSGR S DLTVTjAVXLUKELQSGSL 
RLGKSEAPETPMEEEAELVLTEKSSGTFL5GLSDCTNVTFSKV0 
R FWESNS AAHKE I CAVLAAVTEV 1 RSQGGKETETEYFAALIRKA 
AC'HGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIOFIEKSGGSK 
EATTTLHy LTLLKDLLPCFPEGLVKS CS ETLLRVMTXSHVLVTA 
C AKQAFHSLFF1ARPC-LSTLSAELNAQI I TALYDYVPSENDLQPL 
LA WLK WE KAK 1 NLVR LQKDLG LGH L PR F FGTAVTCLLSPHSQV 
L7 AATQ? L KE 1 LKEC VA PHMAD 3 GS VTS S A SG P AQS VAKMFRAV 
EEGLTYKFHAAWSSVI,QLLCVFFEACGROAHPVMRKCLQSLCDL 
K L S PHFPHTiiAI,DOAVGAAVTSMGPE WliQAV PLE 3 DGSEETLD 
FF R S WLLPV I K DHVQETRLGFFTTY FLPLANT1.KS KAMDLAQAG 
ST V F.SKI YDTLOWQMWTLLPGFCTR PTDVA3 S FKGLARTLGMA1 
SEUPDLRVTVCOALRTLITKGCQAEAPRAEVSRFAKNFLPILFN 
LYGOPVAAGETPAPRRAVLETIRTYLTITETOLVKSLXEKASEK 
VLDPASSDFTRLGVLDLWALAPCADEAA1SKLYSTJRPYLESK 
AHGVQKKAYRVLEEVCASPOGPGALFV0SHLEDLKKTLLDSLRS 
TSS PAKRPRLKCLLHIVRKLSAEKKEFITALIPEVILCTKKVSV 
GAF. KNA FALLVEMGHAFLR FGSNOEE ALOCY LVL 3 Y PGLVG AVT 
MVrCSILALTHLLFEFKGLMGTSTVEQLLENVCLLLASRTRDVV 
KSALGF3 KVAVTVMDVAHIxAKHVOLVMEAlGKLSDl^MRRHFRMK 
LRN LFT \ KFIPK\FGI LTWG KKAVG P KEYHRVLVN I R KAEARAK 
RIHIALSQAAVEEEEEEEEEEEPAQGKGDSIEEILAPSEDEEDNE 
EEERSROKEORKLARORSRAWLKEGGGDEPLNFLDPKVAQRVIA 
TO? GPGRGRKKDHSFKVSADGRLI IF EEADGNKKEEEEGAKGED 
EEKADPMEDV1 iRNKKHQKLKHQXEAEEEELEl FPQYQAGGSGI 
HRPVAKKAMPGAEYKAKKAKGPVKKKGRPDPYAY1PLNRSKLNR 
R KI^KLOGDFKGLVKAAORGSOVGHKNR R KDRRP 


6972 


217f 


973 


?GG A3 LLPLWRRTR PREATVPRGAAORGRARS AEGR I PSSQS PS 
PAEAGGATRSPPPRPPRPARPPGPSAPPLLRSDAGPGATVSAAA 
AA^.TERARRGATHGAQLSTLGHMVLFPVMFLYSLLMKLFQRSTP 

tv t — t e» c r>r\Y T/\rm T>1 T nDPT r CUTVTD D rD PUT PC POWT 1^*27. PVR 
A J LS^Pt'i K YPi->RLi.lJf\E-i lonLsinKr nr rtt»rorynAiwxjr»o 

OH 1 YLSAR I DGNLWRPYTP I SSDDDKG FVDLV3 KVY FKDTHPK 

F PAGGKMSQY LESMQIGDTI E FRGPSGLLVYOGKGK FAIR PDXK 

S N P 1 1 RTVKSVGM I AGGTG 1 TPMLQV I RA I MKDPDDHTVCHLLF 

AiJ0TEKDILU?PELEELRN2OJSARFKLWYTLDRAPEAWDYG0G\ 

F W EEM I RDH L PP PE\ EEPLVLMCGPPPM I QYACLPNL\DHVGH 

PTERCFVF 


6973 


1 


1964 


LOFRCAKRGURAQKCGRPAPGVDAMVLCPVIGKLLHKRVVLASA 
SP^ROEILSNAGLRFEWPSKFKEKLDKASr ATPYGYAMETAKQ 
EVAN R LY QKD LRA PDW I G ADTI VTVGGL I LEK P VDKQDAY 
RMLSRFE/SGREHSVF-rcVAlVHCSSKDHOLDTRVSEFYEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIOALGGMLVESVHGDFL 
NVVGFPLNHFCKQLVKLYYPPRPEDLRRSVJ05DS2PAADTFBDL 









SDVEGGGSEPTQRDAGSRrjEKAEAGEAGQATAEAfcCHRTRETbP
PrPTRLLELIEGFMUSKCH.l/rACKLKVFDLLKDEAPOKAADJAS
KVDASACGMERLU)ICAAKGLLEKTE0GYSKTE7AWVyLASDGE
YSl.HGF3M?fWIDLTWNLFTYLEFAIREGT»0HHKALGKKAEPLF
QDAY YOS PETRLR FMRAJ^KGMTXLT7*CQVATAFNbS R FSSACDV
GG CTGAJLAREIAREYPRNJCVTVFDLPDn EbAAHFQPPGPQAVQ
1 K F AAGDF FRDPbPS AI 1 iYVbCRI IjK DW PDDK VH KLLS R VAES C
XPGAGbbLVETbbDEEKRVAQRAbMCS LNMLVQ'l FGXERSLGEY
S OCLL,ELKGFHOVQWHUGGVLDAIl.\VPKWP?F^OAACSb 


6974 




/17>. | RSCAAFASFASRPVbELFAPPGSHRSPPGRGVATSAOCAUSVRK I 
LLAARPGLGTKYQATMVYKTLFALC1LTAGWRVCE1»PTSAPLSV
SLPTNI VPPTTIMTSSPONTDADTASPSNGTHNNSVLPVTASAP
TELLPKNISIESREEEITSPQSNWEGTNTDPSPSGFSSTSGGVH
LT7TLEEHS LGTPEAG VAATLSQSAAE PPTb 1 SPQAPASSPSSL
STSPPEVFSASVTTNHSSTVTSTQPTGAPTAFF.SPTEESSSDHT
PTSHATAEPVPQEKTP?TTVSGKVMCELIDMLT\FPPFPG


6975 






R PR PT\^CCKWALKLETAMETLINVF>iAKSGKi:GDKYXLSKKEL 
KEbbQTELSGFbDVKEbML*ATEAbKTFEEA* KSP3 3QCSSSRS 
SbPPAPQPPPYL*LSAVPFPlHbPbFl.LPPQAQKDVDAVDKVMK 
E LDE N GDG EV DFQE y W bVAAbTVA CNN F FW E Ni 


6976


12 It 


S-7C 


GCOb*VAYGTTENSPVTFWiPPEDTVEOKAESVGRlMPHTEAHl 
MJ^MEAGTbAKLNTPGEbClRGYCVWU^YWGFPOKTEEAVDODKW 
YWTGDVATMNEQGFCKIVGRSKDMIIRGGENIYPAEbEDFFHTH 
PKVOEVQWGVKDDRMGEEICACIRLKDGEETTVEEIKAFCKGK 
SHFKIPKYIVFVTNYPI»TISGKIOKFKbREOMFRHbNLi* IKQQ
ACPGRLA 


6977 


I29E 


1KB 


S bF 1 NTNbLSNQl RKTSFGKCSEP I S DNTEPO KG KJ jKTPDFA* R 
AN K K S K» H VNGNRTVE P F F EGTOMAV FGMGC F WG A E R K F WV LKG 
VYSTOVGFAGGYTSNPTYKEVCSEXTGHAEVVRVVYQPEHMSFE 
hbbKVFWENHDPTOGMROGNDHGTQYRSAlYPTS7iK0MEAAbSS 
K F.N y QKV bS KHGFGP 3 TTD 1 REGQT FY Y AEDYHQQY bSKH PNG Y 
CGbGGTGVSC PVG I XX 


6978 


3 


"2A 1 


SFPFRDSRRCGCCKGSSLRHTAVAMVKLSKEAKCHbQObFKGSQ 
FATRWGFIPLVIYLGFXRGADPGMPEPTVbSbbWG 


6979 


39i7 




r;EAR\T?GEAVAAAIbSRCRHWSGPPPFPPSPPi;KKGLRG7 , EPWE 
AGPGSGATPGARAMDVRRLKVNELREEbQRRGLDTRGLKTELAE 
RLOAALEAEEPDDEREbDADDEPGRPGHINEEVrTEGGSELEGT 
A0PPPPGL0PHAEPGGYSGFDGHYAMDNITR0NOFYDTQV2X0E 
NE£GYERRPLEMECXX?AYRPEMXTEMXQGAPTSFbPPEASObKP 
DRQQFOSRKP.PYEENRGRGYFEKREDRRGRSPQPPAEEDEDDFP 
DTbVA IDTYNCDbHFKVARDRSSG YPbTl EGFA Y i'jWSGARAS YG 
VRRGRVCFEMKlNEEISVXHLPSTEPDFHWR3GWSbDSCSTQb 
GEEPFSYGYGGTGXKSTNSRFERYGDK FAENDV 3 GCFADFECGN 
DVEbS FTKNGKWMG I AFR I C/XEAXGGQAbYPKVLV XNCAVEFNF 
GORAEPYCSVLPGFTFIQHLPbSERIRGTVGPKSKAECEJLMMV 
G LP AAGKTTWA I KHAAS N P S KXYN I bG TNA1 M D KM R VMGbRRQR. 
K YAGR WDVbl QOATQCLNRblQIAAR KKRNYI LPOTxWYGSAQR 
R KMRPFEGFQRKAIVICPTDEDbXDRT J KRTDEEGKDVPDHAVb 
SMKAN FTbPDVGDFbDEVLr I EbOREEADXbVRQYNEEGRXAGP 
PPEKRFDNRGGGGFRGRGGGGGFQRYENRGPFGGNRGGFQNRGG 

GSGGGGNYRGGFNRSGGGGYSQNRWGiNT^RDNNI^SKNRGSYNRA 
PQOOPPP0OPPPPQPPPQ0PPPPPSYSPARNPPGASTYNKNSNI 
PGSSANTSTPTVS5YSPPQS FGFFPSTF0PSYSOP PYT^QGGYSQ 
GYTAPPPPPPPPPAYNYGSYGGYNPAPYTPPPPPTAOTYPOPSY 
NQYCO YAQOWNOYYONQGOW P PY YGN YDYGSY SGN TOGGTSTQ 


6980 


1


420 


G TRG R KTGR VAA P STRR R TGNMQK bQ TRS PAM S bS D PGbG Y H PT 



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








CWTLRWPPLCSLHALHVFKCLFSSRL^TPVSPRIAMDPMCSCEA 
GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981 


10 


105* 


PGRGFRRASLRPAFAARGVFQGGLGCAXQARTRACAALPTPHPS 
APRLLEPQGVFSLFPPPPCPWPNMILTKAQYDE1AQCLVSVPPT 
RQSLRKLKQRFPSQSQATLLSI FSOEYCKJU KRTHAKHJJTSEAI 
ESYlTCRYLNGWKNGAAPVI/LDLANE^TYAPSLMARIill.KRFLO 
EHEETPPSK^IINSMLRDPSQIPDGVLANQVYCCIVNDCCYGPL 
VDClKHAIGHEHEVU.RDLuLF.KNLSFLDEDOLRAKGYDKTPDF 
3 LQV PVAVEGH1 I HW I ES KAS FGDECSKHAY LHDQFWS Y WNRFG 
PGLVIYWYGFIOELDCNRERGILLKACKPTNIVTLCHSIA 


6982


153 


128b 


FPQODCSAPAAPG LAG S E PRR LRA Y RR K RORARGLKR VAW LAP P 
PS LLQG LOG W AOA PVDGTLG PEDS RAS S PM I QNS RPSLLO POD V 
GDTVETLMLHPV J KAFLCGS I S GTCSTLLFQ PLDLLKTR LQTLQ 
PSDHGSRRVGMLAVLLKWRTESLLGLKKGMSPSIVRCVPGVG1 
YFGTLYSLKQYFLNGHPPTALESVMLGVGSRSVAGVCKSPITVI 
KTRY ESGKYG Y ES I YAALR SI YHS EG HRGL FSGLTATLLRDAP P 
SGI YI/MFYNQTKN1 VPHDOVDATLI P 1 TNFSCGI FAG I LASLVT 
OPADVI K7HMQLY PLKFQK I GQAVTLI f KD YGLRG F FQGG I PRA 
LRR TLMAAMAWTVY E EMMAKKGLK S 




82 


773 


KMSFLQDPS FFTMGMWS IGAGALGAAALALLLANTDVFLS K PQK 
AALEYLEDIDLKTLEKEPRTFKAKELWKKNGAVIMAVRRPGCFL 
CREEAADLSSLKSKLDQLGVFLYAVVKEHl RTEVKDFOPYFKGE 
1 FLDEK K K FY G PORR KMMFMG F 1 RLG VKYNFFRAWNGG FS GNLE 
GEGFILGGVFWGSGKOGILLEHREKEFGDKVNLLSVLFAAKM1 
KPQTLASEKK 


6984 


1845 


1282 


GGRSAYSLPAGS1.PRVPATAAAKMASGVQVADEVCR1FYDMKVR 
KCSTPEE1KKRKKAVI FCLSADKKCI I VEEGKEILVGDVGVT1T 
DPFKMFVGMLPEKDCRYALYPASFETKESRKEELMFFLWAPELA 
PLKSKM1YASSKJDA1KKKFQG1KHECQANGPEDLNRAC1AEKLG 
GSLIVAFEGCPV 


6985 


1887 


1324 


RRTAGI YPCFPKPGRTRHALCSVVLLLJ >TGQLAFDDFCESCAMM 
WCKYAGSRRSMPI.GARI LFHGVFYAGGFAI VYYLIQKFHSRALY 
YKLAVEQLQS1IPEAQEALGPPLN2HYLKL1DRENFVDIVDAKLK 
I PVSGSKSEGLLYVHSSRGGPFQRWHLDEVFLELKDGOQI PVFK 
LSGENGDEVKKE 


6986


642 


1350 


Y HLY FKMGDPNSR K KQALNRLRAQLR KKK ESLADQFDFKMY I AF 
VFKEKXKKSALFEVSEVIPVMTNmEENlLKGVRDSSYSLESSL, 
ELLOKDWOLHAPR YQSMRR DVI GCT0EMDF3 LWPRNDI EK1 VC 
LLFSR WKESDE P FR PVQAXFEFHHGDY E KQFLHVLS RKDKTG I V 
VNNPN0SVFLFJDR0HLOTPKNKATIFKLCSJCLYLP0E0LTHW 
AVGTI EDHLRPYMPE 


6987 


1623 


341 


LEAAJEKAS RAFKESORQTDS K N YETEN WS POKSQRRY DM YNTAC 
FLGEI EVGLYT 1 0 1 LQLT P F FH KEN EL 5 KKHMVQFLSG KWT J P P 
DPRNECYLALS KFTSHLKNLOSDLKRCFDF FI DYMVLLKKRYTQ 
KSI AE IMLSKKVSRC FRKYTELFCH LDP CLLO/SKESQLI *QEEN C 
R XKLEALRADRFAGLLE YLN PNYKDATTMES I VNEYAFLLQONS 
KK PMTNE KQNS I LAN 1 1 LS CL XPNS Kb I OPLITLKKQLRE VLQF 
VGLSHQYPGPYFLACLLFWPENQELDODSKLIEKYVSSLNRSFR 
GQYKRMCRSKQASTLFYLGKRKGLNSIVHKAKIEQYFDKAONTN 
SLWHSGDVWKKNEVKDLLRRLTGQAEGKLISVEYGTEEKIKIPV 
ISVYSGPLRSGRNIERVSFYLGFSIEGPPGL 


6988 


3 


689 


TOLLRRPAVFVGSMSGIRSGLWSASSGHWCAPAAGRAHAPVPR 
LVRGLGAASTAAPQDAQTGPOPMPRADCl MRHLPYFCRGQWRG 
FGRGSKQLG I PTANFPEQWDNLPAD1 STGI YYGWAS VGSGDVM 
KMWSJGWNPYYKKTKKSMETHIMHTFKEDFYGEILNVArVGYL 











RPSKNFDSLESL1SAI0GDJEEAKKRLELPEHLKIKEDNFFOVS 
KSK1NNGH 




£ 


A X X C 


KDQLI YN LLKEEQTPONK ] TVVG VGA VGKACAI £ 2 LMKDLADEL 
ALVDV3 EDKLKGEMMDLQKCSLFLRTPK3 VSGKDYNVTANSKLV 

7 7TAf:apr>r>vr:vc:T>i wt .vrvu^avNi pkpt t p w\tvk v<5t> to r*Ki .i.iv 
j. x x f\\jj\n vv£*vt.oKJbJV.L» v vr> v j>i j r Ar jwix j»v va j -> x^rv«^ivi_/ij _ v 

SNPVDIL7YVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGV 

HPI,SCHGW T VLGEHGDSSVPWSG/4NVAGVSLKTLHPDLGTDKDK 

E0WKEVHK0WESATEV1K1 KGYTSWA1GL$VAJD1J\ESIMKNLR 

D\niD\;cTM tun vp t vnn^ci cvd^t t /yiwr 7 cr*T \/vt/t > t tc rr 
X vnrVi JnJ AbJL» JtsJ. RUL'vrJjoVKL 1 JLAA/i»b* oUJOVA VJijJitt 

E AR L K K S ADTLWG 3 0 X ELO I 


6990 


719 


25P 


THASGKAS VVLALRTRTAV7S LLS FT PATALAVRY AS K KSGG S S 

l/xTT r»rvp o /TiT'j/'A/^* t 1/ vmr/'U v\ni * f~»MT t JiTr>Du PDUUh^ in ]/~~ \r 
lM^iA>t»lvoovjKKUt> 1 KKMouH i VliAoNl in 1 yKrlr KWnruAn V^V 

GKNKCLYAbEEGIVRYTKEVYVPHPRNTEAVDLITRLPKGAVLY 

KTFVHWPAKPEGTFKLVAML 


6991 


169 


453 


RRSSDFRNPGFLSRPVSLREN 1 HRQV1 CSTKNKRRNPKK1 AY LL 
SSLLMTNLNPNESTENOPVEAYWAFTLDQEFLTYACVEGTGCI.F 
CGRHVH 


6992 


944 


510 


RQAPGCS£l»ALRCVRQVYCGLVRAPQVQTRPLSSRFVERRGALY 
RSPMNQENPPPYPGPGPTAPYPPYPPQPMGPGPMGGPYPPPCGY 
PYOGYPQYGWQGGPgEPPKTTVYVVEDORRDELGPSTCLTACWT 
ALCCCCLWDMLT 


6993


1 


374 


QWCVTCPOi^NARCGPAVPPG10AYGAAPFEDLOVDFTEMSKCRG 
DR VWI KNMNVASLCPLWKGPCTWLS PPTAVKVEG3 PAWI HHS H 
VKPAARETWEARPSPDNPFRVTLKKTTSPAPVTPGS 


6994 


346 


HOC 


OW P E KD F Vf-JAA£;S I S S P WG K>IV FKA I LMVLVALI LLH S AJUAC S R 
RDFAPPG00KREAPVDVLT03 GRSVRGTLDAW1GPETMHLVSES 

SS0VLWA3 ssai svaffalsgi aaollnalglagdyiaqglkls 

PGOVQTFLLWGAGAliWYWLLSLIiLGLVLALLGRl LWGLKLV3 F 
liAGFVALMRSVPDPSTRALLLl,AI.LILYALLSRLTGSRASGAOL 
EAKVRGLERQVEELRWRQRRAAKGARSVEEE 


6995


144 


1346 


GS VAVGLSG IMAAQKDLWDA J VIGAGICGCFTAYHLAXHRKR I L 
LLEQFFLPHSRGSSKGOSRIJRKAYLEDFYTRMMHECYQIWACL 
EHEAGT0LKR0TGLLLLGMKEMQELKTIOANLSR0RVEHOCLSS 
EELKORFPNI RLPRGEVGLLDNSGG V I YAY KALRALQDAI RQIjG 
GIVRDGEKWEINPGLLVTVKTTSRSYOAKSbVITAGPWTNybL 
RPLGIEMPLQTLRINVCYWREMVPGSYGVSQAFPCFLWLGLCPH 
HIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVOIL 
SSFVRDHLPDLKPEPAV1ESCMYTNTPDE0FILDRHPKYDNIVI 
GAGFSGHGFK1AP WGKI LYELSMKLTPSYDbAP FRI SRFPSLG 
KA1IL 


6996 


542 


1942 


ETANAEAAARKSAMDWKEVTjRRRLATPNTCPNKKKSEQELKDEE 
KDLFTKYYSEWKGGRKNTNEFYKTlPRFYYRLPAENEVUjQlO.R 
EESRAVFLORKSRELLDNEELONLWFLLDKHQTPPMIGEEAMIN 
YENFLKVGEKAGAKCKQFFTAKVFAKLLHTDSYGR IS IMQPFKY 
WR KVWLHQTR 1 GLS1»YDVAGQGYLRESDI»ENY I LELl PTLPQL 
DGLEKSFYSFYVCTAVRKFFFFLDPLRTGKI KI ODI LACS FLDD 
LhELHDEELS KESQETNWFSA PSALR VYGQYLNLDKDHNGMLS K 
EELSRYGTATMm^TlJiRVFOECLTYIXSE^YKTYljDr^aALEN 
RKEPAALQ YI FX^LDI ENXG YLNVFSLNYFFRAI QELMKIHGQD 
PVS PODVKDEI FDMVKPKDPL KI SLQDLINSNQGDTVTTI LI Dh 
NGFWTYENREALVANDSERSADLDDT 


6997 


370 


1104 


AMELTIF1LRLAIYILTFPLYLLNFLGLWSWICKKWFPYFLVRF 
TVIYNEQMASKKRELFSNIjQEFAGPSGKTiSLLEVGCGTGANFKF 
YPPGCRVTCI DPNPNFEKFLI KS I A£NRHM?FERFWAAGENMH 











QVADGS VDVWCTLVLCS VKNQER I LREVCRVLR PGGAFY FMEH 
VAA E C ST WNY F WQQV LD P AW HLLFDGCNLTRE S W KA3->E RA 5 F SK 
LKbOH I OAPLS WE LVRPH 1 YGYAVK 


6998 




616 


FVSRAl.LRVRSURHPAEERAAPGRPEDAPIECPGATNCPEPLWC 
SHbPVPYAPPTMESRGKSASSPKPDTKVPQVTTEAKVPPAADGK 
APLTKPSKKEAPAEKOQPPAAPTTAPAKKTSAXADPALlilWHSN 
LKPAPTVPSSPDATPEPXGPGDGAEEDEAASGGPGGRGPWSCEN 
FNPLLVAGGVAVAAI am lgvaflvrkk 


6999 



14 


1591 


GRAGACSRRDTAMSIEIESSDV1RL1M0YLKENSLHRA1ATLQE 
ETT V S LNTVDS I E S FV AD INSGHWDTVLCA1QS LXLPDX7 LI DL 
YEOVVLELIELRELGAARSLLROTDPMIMLKOTOPERYIHLENL 
LARSY F3PREAV PIX5SSXEKRRAA1AQALAGEVSVVTFSRLMAL 
LGQAL^OOHOGLLPPGMTI DLFRGKAAVKDVE EEKFPTQ LS RH 
I KFGOKSHVECARFSPDGQYLVTGSVDGFI EVWNFTTGKI RKDb 
KYQAQDNFWMnDDAVljCMCFSRDTEM KWfK IQSG 
OCLRR FERAHSKGVTCLS FSKDSSOI LSASFDOT1 R 1HGL KSGK 
TLKE FRGHSS FVN EATKTQDGHY I 15 ASSDGTVK I WNNKTTECS 
NTFKS LGSTAGTD1 TVNSVI LLPKNPEHFWCNRSNTWI MWMQ 
GQ3 VR SFSSGKREGGDFVCCALSPRGEWI YCVGFDFVLYC FSTV 
TGK1>ERTLTVHEKDVIGIAHHPHQNL1ATYSEDGLLKLWKP 


7000 


2 


827 


GPGVVFI»ELNESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPI.»LOPALTGDVEGLCKJ FEDPENPHHEC/AMOLU.EEDI VGRn 
LLY AACMAGQS D V I RAIA K YG VNLNE KTTRGY Tl. LH CAAA WGRL 
ETLKJvLVELDVDl EAl^FREERARDVAARYSOTEGVKF LDWADA 
K L»rijKKY I AKV^>LAVri) rbKGSCj>Kl#bKEDKJ*T 1 bSAL.K-AKI<JhWL» 
ETHTEAS1NELFE0RQ0LEDIVTPIF7KMTTPC0VKSAKSVTSH 
DOKRSQDDTSN 


7001


2056 


844 


RRCL. 1 I AFLKGCF I FI YPlFI FETEFLSCCPG W5 AV AQS.R LI AN 
FASOVOAIFILPKDSQVGPDVKSEAAPKRALYESVFGSGE 1 CGP 

i. Or ivJvlj^ i M otrVUnv VVVoV T\iiLJtr i-if.i_il.it 'tiif-iV*yynz\0 iHOr i. 1 

VSFA3 VSPTODSRPNMSRPlilTRSPASPLNNOGl PTPAQLTKSN 
APVKIDVGGHMYTSSLATLTKYPESRIGRLFDGTEP1VLDSLKQ 
HYF1DRDG0MFRYILKFLRTSKLLIPDDFKDYTLLYSEAKYFQL 
QPKLLEMERWKQDRETGRFSRPCECLVVRVAPDLGERITLSGDK 
SLI EEVFPEIGDVMCNS VNAGWNHDSTHVI RFPLNG YQ-ILNS VQ 
VLERLQQRGFEIVGSCGGGVDSSQFSEYVLRRELRRTPRVPSVI 
R1KQEPLD 


7002 


1043 


498


PKPSSTRWTTS*TYTDTSSAWACRPrJ'GTCT*TAAPGPTVKWWP 
TPCSRHOSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTS AG TS WPAG R R TGTATS GTATTTS VWPGCG TR M WS TOW S S V 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAHGIAPSSPGLPA 
RGAEVC 


7003 


618 


61 


0GRFRAFCWORDFUJPPGMRLSALLALASKVTLPPHYRYGMSPP 
GSVAJDKRKNPPK1 RRRPVWEPI SDEDWYLFCGDTVEILEGKDA 
GK0GKWQVI R0RNWV WGGLNTH YR YI GKTMD YFGTMI PSEAP 
LLHRQVKLVDPKDRKPTEIEWRFTEAGERVRVSTRSGRI 1 PKPE 
FPRADG I VPETWI IX3PKDTSVEDAI>ERTYVPCLKTLQEEVKEA>1 
G I KETR \NTRR S 1G I SPGAEQLLPNPCPS LEG 


7004 


121 


2285 


FLLP VLTSRS LRU P AV PHAR LGGVEPAAMKBARAKTPR KPTVX.K 
G\PKFTLKT01jG/YYCRVRPLGFPDQECCISVINNTTVQLHTPE 
GYRLtfRNGDYKETQYS FKCVFGTHTTQKELFDWAN PLVNDL lH 
GKNGLLFTYG VTGSGKTHTMTGS PGEGGLLPRCLDMI FNS 1GSF 
QAKR Y VFKSNDRNSMDI QCE VDALLERQKREAK PNPKTSSS KRQ 
VDPEFADMITVOEFCKAEEVBEDSVYGVFVSYI El YNNYI YDLL 
EE VPFDP I NPNLHNLNCPVK I KNHNM YVAGCTEVEVKSTEEAFE 
VFWRGQKKRRIANTHLNRESSRSHSVFNIKLiVOAPLDADGDNVL 











0E KEQ 1 T I SOl^SLVDLKGSERTNRTRAEGNRLREAGN 1 NQSLMT 
LRTCMD VLR ENQM YGTNXWVP YRDSKLTHL FKN Y FDG EGKVR M 3 
VCVNFKA£D YEENLOVMR FAEVTQEVEVAR P VDKA I CGLTPGRK 
YRNQPRGP\3GN£pLVTDVVLQSFPPL.PRCEI LDINDEQTIiPRI, 
IEALEKRHNLROMMJDEF.VKQSNAFKALLCEFDNAVLSKENHMQ 
GKLNEKEKMTSGOKLEIERLEKKNKTLEYKIEIbEKTTTIYEED 
KRNLQOELETQNOKLOROFSDKRRLEARLOGMVTETTMKMEKEC 
ERRVAAKOLEM0NKLWVKDEKLKOuKAIV7KPKTEKPERPSRER 
DREKVTORSVSPSPVPVSYL 


7005




876 


RNMM.YQRWRCLRLCGbQACRLHTAWSTPPRWUAERLGLFEEL 
WAAOVKRLASMAOKEPRTI KI SLPGGQKl DAVAWNTTPYQLARQ 
JSS PLAIXrAVAACVNGE?YPLERPL.ETDSDLRFLTFD£ PEGKAV 
FW HSSTHVLGAAAEQFbGAVLCRGPSTEYG F Y HDFFLG KERT 1 R 
GS ELPVI>ER I CQELTAAAR PFRRLEASRDQl^QLFKDN PFKLHL 
IEEKVTGPTATVYGCGTLVDLC0GP>OiRHTG0IGGIjKI>LSNSSS 
LWRSSG 


7006 


27 


898 


NAFGRKSTAVKMAAAAWLQVLPVILU.I^AHPSPLSFFSAGFAT 
V7LAADRSKWHIP1PSGKNYFSFGK1LFRNTTI FLKFDGEPCDLS 
LNITWYLKSADCYNEIYNFKAEEVEbYLEXLKEKRGLSGKYQTS 
SKLFQKCSELFKTQTFSGDFMHRLPLLGEK0EAKENGTNLTF3G 
DKTAMKEPLQTK0DAPYIFIVH1GISSSKESSKENSLSNLFTMT 
VEVKGPYEYLTLEDYPLMIFFMVMCIVYVI-FGVLWUAWSACYWR 
DLLRIQFWIGAV3 FLGMLEKAVFYAGFO 


7007


2 


1001 


AMTVSGPGTPEPRPATPGASSVEQLRKEGK'ELFKCGDyGGALAA 
YTOALGLDATPCD0AVLHRNRAACHLKLKDYDKAETEASKA1EK 
DGGDVKAL Y RRSQAL E K LGRLDQAVLDLQR CV 5 LE P KN KVFQEA 
LRNIGGOIQEKVRYMSSTDAKVEOMFQILLDPEEKGTEKKOKAS 
QNXjW lar edag ae xi fr s NGVQLLQR LLDMGETDLMLAALRTL 
VG I CSEHOSRTVATLS I bGTRRWSILGVESOAVSIiAACHLU?V 
MFDALKEGVKKGFRGKEGAIIVGEWKOW?CI>l,DVTVMEGMGLSQ 
PGQFFGDC3TCSCRLFG1RFGD3 ILL 


7008 


70 


1478 


CRSALGHERPPPAHLPAGGRRLQTCPRSCRW1>GRPPSGLPPGPR" 
SPPPLAGPGOKWVQKKPAELQGFHRSFKGONPFELAFSLDOPDH 
GDS DFGLQCSAR PDMPASQP1 DI PDAKKRG K K KXRGRATDS FSG 
R FEDVYQLQEDVLGEG AHAR VQTCI NL 1 TSQE Y AVK 1 3 EKQPGH 
IRSRVFREVEMbYOCOGHRNVLELIEFFEEEDRFYLVFEKMRGG 
SI L.SH3 HKRRHFNELEASVWQDVASALD7LHNKG1 AKRDLKPE 
N I Li CEH PNQVS P V K I CD FDLGSG I KLNGDCS P I S TPE L LTPCGS 
A£ YMAPEWEAFSEEAS I YDKRCDLWSLGV1 h YI LLSGY PPFVG 
RCGSDCGWDRGEACPACQNMLFES1QEGKYEFPDKDWAH1SCAA 
KDLISKLLVRDAKQRLSAAQVLQHPWVOGCAPENTLPTPMVLQR 
WDSHFLLPPHPCRIHVRPGGLVRTVTVNE 


7009


1 


626 


ARObRNSWVDDFVAAPLl PLSG/QI PTGNSLYESYYKQVDPAYTG 

LRUVACAOSGHFA'TLSNLNLSMPPPKFHDTS S PLMVTP PSAEAH 
WAVRVEEKAKFDG J FESLLPI NGLLSGDKVKPVLMNS K LPLDVL 
GRVWDI>S DID KDGHLVRDEFA VAMHLVYRALE 


7010 


79 


571 


SHTRRA Wp ETLLS PLCPLLGGGTAMSGGEQKPER Y YVG VDVGT 
GSTOAAJ^VDQSGVLLAFADQP I KN WEPOFNHHEQSS ED I WAACC 
WT KXWOG1DLN0I RGLGFDATCSLWLDKOFHPLP VNOEGDS 
HRWVIM WLDHRAVS QVNR I NETKHS VLQYVGG 


7011




994 


RIOTtPNONQSOTORbLKTPPAVLQPJAPCTTFGVOTOPOPOSL 
LQAQISAAS1TPLL0TQP0PLL0QPQQKAGLLQPPVRIVSQPQP 
ARRLDPPSRFSGRNDRGDQVPNRKDDRSRERERERRRSRERSPO 
RKRSRERSPRRERERSPRRVRRWPRYTVOFSKFSLDCPSCDMM 
ELRRRYQNLY 1 PSDFFDAQF7VVDAFPLSRP FQLGNY CNFYVMH 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








R EVES LEKfl MAI LD PPDADHLYSAXVMLKAS PSMEDLYHKSCAL 
AEDPOELRDGFOHFARLVKFLVGMKGKDEAMAIGGHKSPSLDGP 
DPEKDPSVLI KT\A1 RCCKALTG 


7012 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAG PGTAGGSENGSEVAAQ PAG I iSG PAEVGPGA 
VGERTPRKKEPPRASPPGGLAEPPGSAGPOAGPTWPGSATPME 
TGIAETPEGNRRTSRRKRAKVEYREMDESLANLSEDEYYSEEER 
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMTSOEAACFPDllSGPOCTQKVFbFlRKRTLOLWLDNPKlQL 
TFEATL001.EAPYNSDTVLVKRVHSYLERHGLINFG2YKRIKPL 
PrKKTGKVI JZGSGVSGLAAARObOSFGMDVrLLEARDRVGGRV 
AT FR KGNY VADLGAM VVTG LGGNP MAWS K Q VN M E LA K I KQKCP 
LY EANGOAVPKEKDEM VEOE FNRLLEATS Y LSHOLDFhTVLNNKP 
VSLGOALEWIQLCEKHVKDEQJEHWKKIVKTOEELKELLNKMV 
NLKEKIKELHOOYKEASEVKPPRDITAEFLVKSKHRDLTALCKE 
YDELAETOGKIiEEKLOELEANPPSDVTLSSKDROILDl-.'HrANLB 
FA^ATPLSTLSLKHWnoDDDFEFTGSHLTVRNGYSCVPVALAEG 
LDJ XLNTA VRQVTCYTASGCEVIAVNTRS'TSQTFI YXCDAVLCTL 
PLGVLKQOPP A VQFVP PLPEWKTS AVQRKG FGNLNkV\^LCFDR V 
FWDPS VNLFGHVGSTTASRGELFLFWNLYKAP I LLALVAG HAAG 
IMENISDDVIVGRCLAII.KGIFGSSAVPOPKETWSRWRADPWA 
RG S YS YVAAOS SGND Y DLMAQP I T PG PS I PGAPQPI FRLFFAGE 
KTIRKYPATVHGALLSGLREAGRIADOFLGAMYTLPRQATPGVP 
AQQSPSK 


7013 




2661 


RRAGS V K RG t AR LFG P T E ROSER PLR P S AAR R P EMLSG K KAAAA 
AAAAAAAATGT EAG PGTAGGSENGS EVAAC P AGLSGP AE VGPGA 
VGERTPRKKEPPRASPPGGLAEPPGSAGPOAGPTVVPGSATPME 
TG 1 AETPEG \ R RTSR R KRAKVEY REMDESLANLSEDE Y Y S EEER 
NAXAEKEXKLPPPPPQAPPSEENESEPEEPSGVEGAAFQSRLPH 

DRMTS OEAAC F PDI I SGPOQTQKVFLF I RNRTLOLWLDN PKIQL 
TFEAT1»001>EAPYNSDTVLVHRYHSYLERHG1>INFG3 YKR I KPL 
PTKXTGKVI 1 1 GSGVSGLAAAROLOSFGMDVTLLEARDR VGGRV 
ATFRKGNY VADIiGAMVVTGLGGNPMAWS K0VNMELAK1 KQKCP 
LYEANGQAVPKEKI)EM\^OEFNRLLEATSYLS)40I>DFNVLNKKP 
VSLG0ALEW10LOEKHVKDEQIEHWKKIVKTCEELKELLNKMV 
NLKEKIKELHQOYKEASEVKPPRDITAEFLVKSKHRDLTALCKE 
YDEIAETQGKLEEKLOELEANPPSDVYLSSRDRQ I LDWH FANLE 
FANATPLSTLSLKHWDODD0FEFTGSHLTVRNGYSCVPVALAEG 
LDI KLNTAVROVR Y TASGCE V I AVNTRS TSOTF I YKCDA VLCTL 
PLGVLKOOPPAVOFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
FWDPS VNbFGHVGSTTAS RGELFLFWNLY KAP I LLALVAGEAAG 
J MENI SDDVI VGRCLAI LKGI FGSSAVPOFKETWSR WRADPWA 
RGS YS YVAAGS SGND YDLMAQP I TPG PS I PGA P OP I PRLFFAGE 
UT 1 RN Y P ATVHfl A 7 J ,SR T M E AGR T ADO FLGAM Y TLP R OATPG VP 
AQOSPSM 


7014 


3 


3950 


DFEVGDK I R I LATLEDGKLEGS LKGRTGI FP YK FV KLCPDTR V£ 
ETMAiPOEGSlJXRIPETSLDCljEirrLGVEEORHETSDHEAEEPD 
CI I SEAPTSPLGHLTSEYDTDRNSYQDEDTAGGPPRSPGVEWEM 
PLATDS PTSDPTEWNGI SSQPQVPFHPttLQKSQY Y STVGGSHP 
HSEQY PPLLPLEARTRDYASLPPKRMYSQLKTLOKPVLPLYRG S 
SVSASRWKPRQSS PQLHNLASYTKKHHTSSVYS 1 S ERLEMKPG 
PQAOGLVMEAATHSOGDGSTDLDSKLiTQQIjI EFEKS LAGPGTEP 
DKILRHFSIMDFNSEKDIVRGSSKJjITEQELPERRKALRPPPPR 
PCTPVSTSPHLLVDONLKPAPPLWRPSRPAPLPPSAOORTNAV 
SPKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVLVRIEEME 

rdi^mysraqeelnlmleekodbssraetledlkfcesniesln' 
MELOOLREMTLLSSOSSSLVAPSGSVSAEKP^ORjVLEKRAKVIE 
ELLQTERDY 1 RDLEMC1 ERIMVPMQQAQVFN 1 DFEGLFGNMOMV 
IKVSKOLLAALElSDAVGPVFLGKRDELEGTyKIYCQNHDEAIA 
LLElYEHDEKlOKHLODSLADLXSLYNEWGCTrJYlNLGSFLlKP 
V0RVMRYPLLLMELLNSTPESKPDKVPLTNAV1AVKE1NVNTNE 
YXRRKDLVLXYRKGDEDSIjMEK3 SKLN2HS1 2 KKSNRVSSHLKH 
1/TGFAPQI KDLVi- EETEKNFkMQERijI K5r J iOLSL J.LQHIRES 
ACVKVVAAVSMtfDVCMERGilRDLEQFERVHRY 1 SDQLFTNFXER 
TERLVISPLNOLLSKFTGPHKIiVOKRFDKLLDFYNCTERAEKLK 
DKKTLF.ELOSAR-NNYEALNAQLLDELPKFHOyAOGLFTNCVHGY 
AEAHCDFVK0AbE0LKPLLSLLKVAGREGNL3AIFHEEHSRVLO 
UbQVFTFFPESLPATKKPFERKTlDRQSARKPLLGLPSYHLOSE 
FLRAS LLAR YPPEKLFQAERNFNAAQDLDVS ] .LEGDLVG V 1 KKK 
DPMGSQNRWL1DNGVTXGFVYSSFLKPTOPR*SHSDA£VGSHSS 
TESEHGSSSPRFPR0KSGSTLTFNPN\S\MAVSFTSGSC0KOPQ 
DASPPPKEWDOGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARDVKQP7ATPRSYRNFRHFEIVGYSVPGRNGQSQDLVXG 
CARTAOAPEDRSTEPDGS EAEGNQVY FAVYT * KAKN PNELS VSA 
NQKLKI LEFKDVTGNTEWWIuAEVNGKKGYVPSNy 1 RKTEYT 


7015 lB<ii 


Sli 


H0AWHE\VAAPSWRG/U-<LVQSVLRVWQVGPKVARERV1PFSSLL» 
GF0RRCVSCVAGSAFSCPRIASASRSNGOGSALDHFLGFS0PDS 
SVTPCVPAVSMNRDEQDVLLVHHPDMPENSR\1 J RWl J liGAPNAG 
KSTLSNOLLGRKVFPVSRKVHTTRCOALGVlTEKETOVILliDTP 
G11SPGK0KRHKLELSLLEDPWKSMESADLVVVLVDVSDKWTRN 
OLSPOLLRCLTKYSQIPSVLVMNKVDCLKOK.SVMjELTAALTEG 

wngkk:,kmroafhshpgthcpspavkdpntq£Vgnpqrigmph 
fkejfmlsalsoedvktlkqylltoaopgpvvlyhsavltsqtpe 
ei cani i rekilehlpcevpynvqoktavkeegpggelvioqkl 
LVPKESYVKIjLIGPKGHVISOIAOEAGHDIjMDI flcdvdirlsv 
KLLK 


7016


167


2513 


ILNAPKPPPFRDSVEAVAAKRDTGGGSWGTGMDVSGGETDWRST 
AFROKLVSQ I E DAMR KAG VAHS KSS KDMES H V F1>KA KTR DE Y LS 
LVARL1 IHFRD3HNKKSQASVSDPMNAL0SLTGGPAAGAAGIGM 
PPRGPGO^LGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATP0TOLOUX>VAAAAAAATARSSSSSSHRKYSSSSSSSNSKQ 
FOAOOSAMQQVOFOA \ WQCXK?QMCvQQCVvQHLi KLHHQ^W 
QIOOOOOQI^RIAOLOLOQQQOOOOOQOOOOCQALOAOPPIOQP 
PMQQPQPPPSOAliPOQLOQMHHTOHHOPPPOPOOPPVAQNQPSQ 
LPP0SQT0?LVSQA0ALPGOMLYTQPPLKFVKAPMWQ0PPVQP 
OV0O0OTAVOTA0AAOMVAPG VOVSQS SLPMLSSPS PGQQVQTP 
OSMPPPPOPSPOPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
XOSPVTARTPONFSVPSPGPLNTPVNPSSVMfJFAGSSOAEEOOY 
LDKLK0LSKY1EPLRRMINK1DKNEDRKKDLSKWKSLLDILTDP 
SKRCPLKTL0KCE3ALEKLKNDMAVPTPPPPPVPPTK0OYLCQP 
LLDAVLANIRSPVFNHSLYRTFVPAMTAIHGPPJTAPWCTRKR 
R LEDDERQS I PS VLQGEVARLDPK FLVNLDPS H CSNNGTVHL1 C 
KLDDKDLPSVPPLELSVPADYPAOSPLWIDROWOYDANPFLOSV 
HR CMT£RL»LQLPDKHS VTALLNTWAOS VHOA CLSAA 


7017


1


178S 


INLGNTCYMNSVI*A1>FMATDFRRQVLSLNLKGCNSLMKKL0H1« 
FA7LAHTQREAYAPRIFFEASRPPWFTPRS0CDCSEYLRFLLDR 
LHESEK1LKVOASHKPSEILECSETSL0EVASKAAVLTETPRTS 
DGEKTLIEKMFGGKLRTH1RCLNCRSTSQKAEAFTDLSLAFWPS 
YSLEY>1SCPDCSCSPSI0DGGLM0ASVPGPSEEPVVYNPTTAAF 
3CDSLVNEKTIGSPPNEFYCSENTSVPNESNKILVNKPVP0KPG 
GETTPSVTDLLNYFLAPEILTGDNOYYCENCJ^LCKAEKTMQIT 
EEPEYLILTLLKFSYDQKYHVRRKILDKVSLPLVLELPVKRITS 
FS SIjS E SWS VDVDFT&LSENLAKKLKP S G TDE AS CTKbVPYLLS 
SVWHSGlSSESGHYYSYARNITSTrsSVOi^YHOSEAUVLASSO 
S HLLGR DS PS AV FEQDLENKEMS KE WF I.r NDS R VTFTS FQS VQK 
ITSRFPKDTAYVLLYKXQHSTNGLSGNlvPTSGLWINGDPPLOKE 
LMDA I T KDN Kb Y LQ EQE LN ARARALO AAS AS C S FR P NG FDDND P 
PGS CG P TGGGGGGG FKTVGRLVF 


7018 


484 


106t 


SLVFRGNTWSGEAGHHeSAbFNLAAYHCLFVGTERlRAPEJ 1 FQ 
P S bIGEEQAGI AETLQ Y I LDRY P KDVCEMLVONV FbTGGNTMY P 
GM KARM E K ELLEMR PFRSS FQVQLAS N P VbDAWYGAR DWALNHL 
DDNEVWlTRKEYEEKGGEyLKEHCASKlYVPIRLPKQASRSSDA 
OAS S KG S AAGGGGAGEQA 


7019


1046 


33S 


APGGFLVTMVFPAPSPPWMIX3CCSHEV7AGPPTLCKDMSAbVAA 
RMRH 1 PLAPGSDWRDL PN 1 EVRLSDG7KAR KLRYTHHDRKNGRS 
S SG ALRG VCSCVEAG KACDPAARO FNTL 1 PWCLPHTGNRHNHWA 
GL•YGRLEWDGFFSTTVTNPEPNGKOGRV) J HPEOHR^A^SVRECAR 
SQGFPDTYRLFGNILDKHRCVGNAVPFPLAKA1GLE1KLCMLAK 
ARE5ASAK1 KEEEAAKD 


7020 


a 


2154 


FAi)SKRKSVLLDKlKNL0VALTSKO0. c LL*rA«SFVARNTFKRVR 
KGFLWRKVAVFFSNTPTRASP0LREAVLKLSDAG1TPLFLTRQE 
DR0LINAL0INNTAVGJ£ALVLPAGRDI,TDKLENVLTCHVCLDIC 
m DPSCGFGSWRPSFRDRRAAGSDVDJ DSV^FI LDSAETTTLFQF 
NEMKKYIAYLVRQLDMSPDPKASQHFAKVAVVQHAPSESVDNAS 
MPPVKVEFSLTDYGSXEKbVDFLSRGKTQLOGTRALGSAIEyTl 
ENVFE^APNPRDLKIVVLMLTGEVPEOOI.EEAORVILQAKCKGY 
FFWLGIGRKVNIKEVYTFASEPNDVFFKJ.VDKSTELNEEPbMR 
FGRLbPSFVSSENAFYLSPDIRKOCDWFOGDQPTKNLVKFGHKQ 
VNVPNN\TTSSPTSNP\n:TTKPVTTTKPV'TTTTKPVTTTTKPVTI 
1 NOPS VKPAAAK PA PAKPVAAKP VATK T AT VR PPVAVK PATAAK 
PVAAKPAAVRPPAAAAAKPVATKPLVPRPCAAKPAATKPATTKP 
MVKMSREVQVFEITENSAKLHWERPEPFGPYFYDLTVTSA2HIDQS 
LVbKQNbTVTDRVIGGLlAGOTYHVAWCYLRSOVRATYHGSFS 
TKKSQPPPPQPARSASSST1NLMVSTEFLALTETDICKLPKDEG 
TCRDFI bKWYYDPNTKSCARFVi YGGCGGN ENK PGSQKECEK VCA 
PVbAKPGVlSVMGT 


7021 


2 


336 


VNAVS FFPNGYAFATGSDDATCRLFDLRAD0ELLLYSHDN1 I CG 
I TS V A FSKS GR bLLAG YDDFN CNVVJ D7 1 ,K GDRAGVbAGHDN R VS 
CbGVTDDGMAVATG S WDS FLR 1 WN 


7022 


2 


856 


VY 1 GS FWSHPLLI PDKRKLFE AEEQDLFK DIOSLPRKAALRKLN 
DL3 KRARLAKVHAY 1 1 SSLKKEMPSVFGKDNKKlCELVNNbASiy 
GR I EREHO I S PGDFPNLKRMQDQLQAODFSKFQPLKSKbbEWD 
DMLAHDI A0LMVLVRQEESORP 1 OMV KGG AFEGTLHGPFGHGYG 
EGAGEGIDDAEWVVARDKPMYDEIFYTLSPVDGKITGANAKKEM 
VRS KLPNS VI/3K1 WKLADI DKDGMLDDDE FAbANHbl KVKLEGH 
FT ,PN FT .P AH I »T .P PS KjR KV AE 


7023 


2 


746 


AMVFGG WPY VPQYRD I RRTQNADGFSTY VCLVLLVANI LRIbF 
WFGRRFESPLXWQSAlMIbTMLLMLKLCTEVRVANELNARRRSF 
TAADS KDEEVKVAPRRS FLDFDPHH FWQWS S F S DYVQCVIAFTG 
VAGYlTYLSIDSALFVETbGFLAVLTEAMLGVPObyRNHRHQST 
EGMS 1 KM-/bMWTSGDAFKTAYFLLKGAPLCFS VCGLLQVbVDLA 
I LGQAY AFARH PQK PAPHAVH PTGTKAL 


7024 


1207 


19C 


RTG VTG WAQ VWMFGGGGVLS SGEQLQK P V KP ERGLGPSl!)GWLV 
SSRRGSPGTVLGLPFWLLTPVLVSRS I RSMLLbTRSPTAWHRLS 
Ol^PPVLPGTl^GQAl^LRSWLLSROGP^TGGOGOPOGPGbRT 
RLL1TGLFGAGLGGAMLALRAEKERL000KRTEALRQAAVGQGD 
FHLLDHRGRAJKKADFRGQWVLM Y FGFTH CPDI CPDE1EKLVQV 
VROLEAE PGLPPVQPVF 3 TVDPERDDVEAI4ARYV0DFHPRLLGb 
TGr.TKOVAQASHSYRVyYNAGPKDEDODY 1 VDKS3A1 YLLNPDG 
tor i l>Y loKoKSAtQi bVb VRKHrlAArRSVLi: 


7025 


232 


832 


ERNS P1GNHENL* K\HSLDCLCFHGDWEGWTOFOTLGDNOEECF 
KOV 1 KTCEKRPTFNQHTVFNLHQRLNTGDKLNEFKELGKAF1SG 
SDHTOJiOLIHTSEKFCGDKECGNTFLPDSEVIQYOTVHTVKKTy 
ECK^CGKSFSLRSSLTGHKKiHTGEKPFKCKDCGKAFKFHSQLS 
VKRk I HTGEKSYECKECGKAFSCG 


7026 


32e 


1146 


NPKrSIGDIKDIKKAAXSMbDPAHKSHFHPVTPSLVFLCFIFDG 
LHC/MiLSVGVSKRSNTWGNENEERGTPYASRFKDMFNFIALEK 
S S V LRHCCDLL I GVAAGS SDK I CTSSLQVQRR FKAMKAS 1GRLS 
HGKSADLLISCNAFSAlGWISSRPWVGEbMFTFLPGDFESPLHK 
LR K F S * L PRKHR * OP INAVRM FLDQCMDGS 1 ALRAI VSEI PVFE 
EKKJ^G* KGIGE1 r * VWGCTLPPHYWGAVTTNVPKLSNSGKLLG 
OEEOPHIFG 


7027


13 


954 


GRK LQCWRPEDAEDGAEGGGKRGEAG WEGG Y ?E IVKEN JCbFEH 
YYC'E L Kl V PEGEWGQFMDAbREPLP ATLR 1 TG Y KSHAXE I LHCL 
KNKYFKELEDLEMDGOKVEVPOPLSWYPEELAWHTNLSRKIU?K 
SPKI..EKFHQFLVSETESGN] FRQEAVSMJ PPLLLNVRTHHKILD 
MCJ-.APGSKTTOLI EMUiADMNVPF PEGFV1 ANDVDNKRCYLLVH 
QAFJ^ I SS PC1MWNHDASS 1 PRL0 1 DVDGRKE 1 LFYDR I LCDVP 
CS CPG TMR KN 1 DVWK KWTTLNSLQI ,HGL0LR ? ATRG AEQL 


7028


189 


go e 


SKF FPEPEPGTMVEKGSDSSSEKGGVPGTPSTOSLGSKNK1 RNS 
KKWCSWYSMLSPTYKQRNEDFRKLFSKLPEAERLIVDYSCALQR 
EI IjLOGRLYLS ENWI CFYSN I FRWETT1 S 1 OLKEVTCLK KEKTA 
KL1PNA1Q 


7029 


1343 


40 


VLEFrCTEAKOATGrSSKLRHGTGQEKGREGPRCPSGLA.OL.RLMG 

/pcp}1agretgprasapjpg5*ghgwhk*rkpgrger£egpsal 
sph?> > sll^jm00apthvgpgkigs0rprs5wpe0vgvgs0lsre 
rkka-rslfgaaasertemtxersp/rpccgydsswftopgkk 
trkf.nsr rntmvs rgggclly plos 1mpe*0lr* gakas p ptog 
r*gkggprspltkasgtth1ptpffgsip/rptrdsgpgtdns\ 
aa pgo krg hr ea * ogpepv/ kgrvtthlogpag ♦ tkplgs \ rnw 
vpgfaegeccegaglegkp*plkgcrstltfspqlsipmvgkkp 
pegttasffp\rschse*rkpppscphapalslphplplplppli 

PLFLPGAGT*HSARSGRPGCSErGSLCWCHHCPPHCFKCSPGG 
T 


7030


2 


521 


FVCFSAPGSGQGGKRRVtfMELSAVGERVr AAEALLKRR1RKGRM 
EYLVKWKGVJSQKYSTWEPEEWILDARLLAAFEEREREMELYGPK 
KRGPKPKTFLLKAOAKAKAKTYEFRSDSARGJRIPYPCRSPQPL 
ASTSRAREGLRN\RVCPRORAAPAFAAP\PRRGPSGPGPRPG*G 
PGLKF PGPGGPSKKGFVPAS EQHQKQQKLPRRGPSGPGPRPG 


7031


96 0 


59 


HCSVKC-AEWPRKPFAQICPvLTbKPHboSrKb rV»Pl» 
/CKPS /RHCDELHEGPSRTAALPCGKPQPKHGVEECG/PCPCLA 
PRRJL TE P PALTVS F VGRAAPS GAL* PSGRACSACSHRLA PEAAL 
SAAAPRPSLGSGONASGLPAASLPPODSSOPHKTVPSPARSVPP 
bGAOARAAPPRLWCPRALVSG * EAS PEAVSVAAGPPVPGPTPST 
SGSTASHS RRGC* S PR*TPAP PRRDHGRS AAFEVLTAAASAQPC 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSb 


7032 


1393 


2104 


RRPGRTEPVEPPPVPPPPRASNSKSRCR»RNLHLAPL*OSPLRK 
SR01GTSSLPFGRSAGERPRFAATFC1>SRGGSSPVFL*PSSSSL 
EPKKKRQFGRLHSLFWKSWOIOMNSFLLTPKLDTSLMSGWRYRQR 
LPRLHTFLKKSLOMASELAPPbPTPAPLASSLPPPPGPPPLLPV 
PLA* I,SRSGILVPPNSGFSLSC\PLGDH* GSSGEVRGSCGSPPP 
HRCfcVU>PPP*LLLPPR 


7033


689 


815 


RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*AP?GPSSA 
LMMPSSCPWRTGALGPSPAGSKALGRCTSSVGPGSRWLTRTSSP 
GCATRlTrtRTMRMEPRPLRSRMGESAPGJPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL * SRRTAEWCWPPSCSCCKGWC * SWSA 
WDWRRPPLQVSPAPSSSCRA^CCWCLESIT*SS5TAKSRATGA5 
SSSTCPTSRSDRGAAWTP\SPMGAPLLPCSVPLISREEALODPR 
NPSP*GVCSGSSGHAGLALGKPPVACSVP 


7034 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPOHHGAPGPDGSAPDPAHVRERV 
KAM F Y HAY DS Y LE N A F P F DE LR PLT CDGH DTV3GS FS LTL I DALD 
TLIiXTLFYFOILGNVSEFQRWEVLQDSVDFDIDVU/vSVFETNl 
RWGGLLSAKLLSKKAGVEVEAGW PCSG P LLRMAEEAARKLLPA 
FOTPTGMPYGTVNLLHGVNPGETPVTCrAGIGTFlVEFATLSSl* 
TGD P VFEDVARVALMRLW ESRSDIG LVGNH I DVLTC K WVAQDAG 
IGAGVDS YFEYLVKGA I LLC-DKKLMAM FLEYNKAIWYTRFDDW 
YLWVQMYKGTVSMPVF0SLEAYWPGL0SL1GDIDNAMRTFLNYY 
TVWKQFGGLPEFYNI FQGYTVEKREGYPLRPELIESAMYLYRAT 
GDPTLbELGRDAVESI EK I S KVECGFATI KDLRDHKLDNRMESF 
FUIETVKYLYLLFDPTNFIHNNGSTFDAVITPYGEC1LGAGGYI 
KNTEAHPIDPAALHCCORLK^EOWEVEDLMREFYSLKRSRSKFO 
KNTVS5GPWEPPARPGTLFSPENHDQARERKPAX0KVPLLSCPS 
QPFTSKLALLGQVFI ,DSS * P^DNFFl FI FLRLNYNKLL1A1 1 KK 
K 


7035


92 




EDTSSWPFRLLlPLGbLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFY HAY DS Y LENAFP FDE LRPLTCDGHDTWGS FSLTLI DALD 
tll\tlfyfqi lgn vse for WEVLQDS VDFDI DVNAS VFETN I 
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGI gtfi vefatlssl 
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAODAG 
IGAGVDS YFEYLVKGA I LLODKKLMAM FLEYNKA IRNYTR FDDW- 
YLWVQM YKGTVSM PVFOS LEAYWPGLQSLI GDI DNANRTFLNY Y 
TVWKOFGGLPEFYNI PGGYTVEKREGY P^RPELIESAMYLYRAT 
GDPTLLELGRDAVES1 EK1S KVECC-FATI KDLRDHKLDNRMESF 
FLAE1-VK Y LYLLFDPTNF I KNNGSTFDAV I TPYGECI LGAGGYI 
FNTEAHPIDPAALHCCORLKEEQWEVEDLMREPYSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFS PENHDQAR ERKPAXQXVPLLSCPS 
QPFTSKLALLGOVFLDSS* PLDNFFIFI FLRLNYNKLLLAI I KK 
K 


7036


442 


761 


CLAPLFSCFQI INLHLAPSGRLRWAKbRGPGRN*LPGEGPSIPT 
RNW*ERKAGCSOPC/PACOHKGRPPGVSPLPRDPHPTTLRPLPP 
PPPP PPP PPRRPPRNRR PG 


7037 


442 


761


CLAPLFSCFQI INLHLAPSGRLRWAWLRGPGRN*LPGEGPS2PT 
RNW*ERXAGCSQPC/ PAQOHHGRFPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7038


155 


831 


GAGAASDMSSGLRAADFPRWKRH1 SEQLRRR3RLQR0AFEEI IL 
rivriKi ,t .vic<inT ,Hc\n aoki .ntr khdvpnbhei SPGHDGTWNDNO 
LQEMAQLRI KHQEELTELHKKRGELAQ\RVI DLNNQMQRKDREM 
GWEAKIAECLQTTSDLETECLDLRTKLCDLERANOTLKDEYDA 
LQI TFTALEGKLRKTTEENOELVTR WMAEKAQEANR LNAR E* KR 
LQEAASPAAERACRS SKGTSTSRTG 


7039 


155 


691 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLCROAFEEUL 
QYNKXLEKSDI^SVl^OKI^AEKHDVPNRHEISPGHDGTWNDNQ 
LQEKAOLR I KH0E EL ?E LHK KRGE LAQ \ R V 1 DLNNQMQR KDREM 
QMNEAKI AECLQTI S DLETECLDLRTKLCDLERANQTLKPE YDA 
LQI TFTALEGKLRKTTEENOELVTR WMAE KAQEANRLKARE * KR 
LQEAAS PAAERACRS S KG TSTSRTG 


7040


34 


769 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
GYES VKRDS EATGSA5 S A0DSTS ENSSS VGGRCRSLKTPKKRSN 
PG SQR R R LI PALSLDTSS PVR KPPNSTGVR K'VDG PbR SS FRGLG 
EPFE1 KVYEI DDVERbQRRRGGASKEAMCFNAXLKI LEHRQQR I 
AEVRAKYEWLMKF.bEATKOVLMLDPNKWLSSFDLEOVKFLDSLE 
YL EALECVTER LES R VNFCKAHLMMI TCFD J T 


7041 


567


sgrvamgrrrapaggslgralmrhqtqrsrshrktdsklhtsel 
ndgydwgrlnl0svtsosslddflataelagtepvaekunikfv 
pasartgllsfeesqrikklheenkqflc1prrpnwnqicttpee 
lkoaekdnflewrrolWrleeeokliltpferj^ldfwrclwrv 
iersd1w01vda 


7042 


7 , 34S 


_ PI HMAAAALRADI \ 1 SPLFPHIQGYbbbS7*£KG\ATS LKTKGAb 
PLETVTMYTVIPKSKYVLVKPDTQYPYSENl.DEFKRLAIvNSASN 
DDbLKAEVAI SDYGDKbTbEbREKY 


7043 




2170 


ARGMAARDSDSEEDLVSYGTGl>EPbEEGERPKKPIPLODQTVRD 
E KGRYKR FHGAF5GG FSAGYFNTVGSKEGW I PSTFVS S R0NRAD 
KSVLGPEDFMDEEDLSEFGIAPKAI VTTDDFASKTKDR2 REKAR 
QLAAATAP 1 PGATU >DDLI TPA KLSVGFELLR KMGW KEGCGVGP 
RVKRRFRRQKPDPGVKI YGCALPPGSSEGSEGEDDDYLFDNVTF 
AP KDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGEH FN LFSGG 
SERACDLGEIGUNKGRKIjGISGQAFGVGAI/EESDDDIYATETLS 
KYDTVLKDEL'PGDGLYGWTAPRQYKNQKESEKDLRYVGK I LDGF 
SLASKPLSSKKIYPPPELPRDYRPVHYFRPMVAATSENSHLLOV 
bS ES AG KAT PDPGTH S KHQLNAS KRAEbbG ET ? 1 OGS ATS VbEF 
bS OKDKER I K EMKOATDLKAAQbKAR SLA0WA0SSRA0PS PAAA 
AG H CS WNMA LGGGTATLKASNFKP FAKDPE K0 KR YDEFb VKM KQ 
GOKDAbERCbDPSKTEWERGRERDEFARAALLYASSHST^SSRF 
THAKEEDDSDOVEVFRDOENDVGDKQSAVKMKMFGKLTRDTFTW 
HPDKbLFQ/RbVGbPRVKRDKYSVFNFLTLPETASLFTTO.ASSE 
KVSQKRGPDKSRKPSRVmTSKJiEKKEDSlSEFbRLARSKAEPPK 
O0SSPLVNKEEEHAPELSAN 


7044 


276 


734 


EVYLTDEFAKGRKVADbYELVQYAGNIl PRLYLblTVGV\ r YVKS 
FPOSKKDI bKDLVEMCRGVQHPLRGbFbRNYbbOCTRNI bPDEG 
EPTDEETTGD1 SDSMDFVLbNFAEMNKLWVRMQHQGHSR D3EKR 
EREFQELRILVGTNLVRLSQV


7045


3 


513


L.G F KME AbS RAGQEMS LAALKQHD P Y I TS 1 ADLTGQV ALY T FC P
I KANOWEKTDlEGTLFVYRRSASPYHGFTIWPLNMHNLVtKVNK
DLEFQbHEPFLLYRNASLSIYSIWFYDKNDCHRIAKb^DVVEE
ETR RS00A/RSGQTESQPGQWLQR PQAHRH PGD AEQSQG


7046 


3 


513


bG F KM E AbS RAGQEMS LAALKQHD P Y I TS I APLTGQVA1, Y TFCP
KAN0WEKTD1 EGTLF VY R RS AS P YHGFTI VNK LNMHNLV Z ? VNK
DbEFObHEPFLLYPJ^SbSIYSIWFYT>KNDCHRIAKbK^Jj^EE
1 ETRnSOQA/RSGOTESQPGOWbORPQAHRHPGDAEQSOG 


7047


103 


466 


QMK 3 E X CG WS EGLTS 1 KGNCHNFYTAI SKDVT Y KEbKNLLN S KK
1 Kb 1 DVRE 1 W E 1 LEY OK 1 PESINVPbDEVGEAbQMNPRDF KEKY 
I NEV KPS KSDS / 1 VFS YLAGVRSKKAbDTAl EbGFHS YYEK 


7048 


92 


627 


i FFCbTbLSSWDYRHKATRRVISSPVFTMEDSGKTFSSEEEEANY 
WWDbAKTYKORAENTCEEbREPQEGSREYE AEbETObOO 1 ETRN 
RDLLSENNRLRMELETI KEKFEVQHSEGYRQ1 SALEDDLA0TKA 
I KDOL0KYIRELE0ANDDbERAKRATDHGLSKTFE\0RbN\QAI 

EKKW


7049 


393 


938


KRTGSASYGGPPPGbGGPATXASVAGRCSSVGKl PARRCYEDEL
\^VFEAVGR3YELRbfWDFDGKNT?GYAFVMYCHKHEAKRAVREb
NNYEJ RPGRLLGVCCSVDNCRLFIGGI PKMKKREEILEE: AKVT 
EGVbDVI VYASAADKMKNRGLRbRGVREPPRGCKWLGRKb 1 AWX 
ASSLWG 


7050


393


938 


KRTGSAS YGGPPPGLGGPATXAS VAGRCSS VGK J PARRCY EOEb 
VPVFEAVGR 5 YELKi^MDFrcKJ^RGVAFVMyCHKHEAKitAVREL 
IWYEIHPGRLLGVCC^VDMCRLKIGGIPKMKKRFBILEEIAKVT 
EGVLDV 3 VYASAADKKXNRGLR LRG VR E P PRGCH WLGRKLI AWX 
ASSI.WG 


7051 




816 


KKMNIAEI CDNAKKGREYALLGNYDSSMVYYQGVMQOI 0RHCQS 
VRDPA3 KGKWQQVRCELL.EEYEQVXS IVGTLE5FK3DKPPDFPV 
SCODF.PFRDPAVWPPPVPAEHRAPPQIRR/RQSKRKTSEERNGR 
SRS PC-TCR PSTN P3SKS EKPSTS RDKDYRARGRDDKGR KNMQDG 
ASDGr.MFKFDGAGYDKDjVEALERDIVSRNPSIHWDDIADLEEA 
KKLLR H AG VLPMWK 


7052 


467


715 


SCPGRG KMS KLLNPEFMTSRDY YFDSYAH FG I H EEMLKDE VRTL 
TYRN £ KYHN KHVFKDKWLDVGS G TG J LSMFAAR 0GPRR 


7053 


467 


715 


SCPGRGKMSKLLNPEKMTSRDYYFDSYAKFGIHEEKLKDEVRTL 
T Y R N S M Y HN KHV FKDK WLDVGSGTG 1 bSMF AAR QGPR R 


7054 




1036 


GTS0R S R ETDARRRSAGAEPTARL PWPAALEEWPSCPCE PLGFG 
RRCFWDA>'EYDEKliARFRQAHLNPFNKCSGPRC?HEOGPGEEVPD 
VTPEEAl»FELPPGEPEFRCPERVMDLGI»SEDHF5RPVGLFliASD 
VOO^ROAl EECXQVI LELPEQSEXOKDAWRbl HLRLKLOELKD 
PNEDEP.V J KVLLEKRFV KEKSKSVKQTCDKCNT1 1 WGL3QTWYT 
CTGCYYR C KS K CLNU S KPCVSS KVS HQAEYELN I CP ETG hVSQ 
DYKCAECRAP I /CS/DGWPSEARQCDYTGOYYCSHCHViNDLAV 
IFARVVnNWDFKPRKV^RCSMRYlJtl.WSRPVLRi-RElN 


7055 


2 


527 


DSRRVSWRSWLANE/>:GKHLCLF3V;i J SMKVLLFWKTFI,LYNQGP 
EYHYLKOMtiG/ALCLf RASAS V1iN1/NC5LILLPMCRTLLiAYIjRG 
SOKVPSRRTRRLLDKSRTFHITCGATI CI FSGVHVAAHLVNALN 
FSVNY S ^DFVELNAAH YRtiEDPRKLLFTTVPGLTGVCMEWLFl, 
K 


7056 


2 


527 


DSRRV5 KRS WliANE / KG KHLCLF 1 WLS I4NVLLF WKTFLL YNQG P 
EYHYL110MLG/ALGL5RASASVLNLNCSLILLPMCRTLLAYLRG 
SQKVPSRRTRRLLDKSRTFHITCGAT1 CI FSGVHVAAHLVNALN 
FSVNYSEDFVELNAARYRDEDPRKLLF'JTVPGLTGVCMEWI>rL 
M 


7057 


136f 


431 


GIYXHWEKIFRPTCIGDROENDKF.KLNLSNHRDOELLHASCOA 
SGEVPSOASLRGFFTEDEPGCFGEGENLPEAIiQN 3 QDEGTGEOL 
SPQER3 SEKOLGQHLPNPHSGEMS'I'MWIiEEKRETSQKGQPRAPM 
AQKJUPTCRECGKTFYR^SQLI FHQRTHTGETYFQCTI CKKAFbR 
SSDFVKJiQRTHTGEKPCXCDYCGKGFSDFSGbRHKEKIHTGEKP 
YKCP3CEKSFI0RSNFNRKQRVHTGEKPYKCSHCGKSFSWSSSL 
DKHORSHLGKKPFQ*?VTKl>SFPISIS0PSHKNTCLH0EELCLR 
GYPC 


7058 


1 


469 


FSGFGAVPPAEOCRMSrLRITEAFLYMDYLCFRALCCKGPPPAR 
PEYDL VC 3 GLTGSG KTS LLSKLCSES PpNWSTTG FS I KAVP FQ 

ARfl * S CTOLLQHPQLCTLPPLI LA 


7059


a 


1178 


WPAFPROPAAV^DALU?TGPRRARGCLGAAGPTSSGRAARTPA 
APWARFSAWLECVCWTFDLELGQALELVYPNDFRLTDKEKSSI 
CYLS FPDSHSGCLGPTQFSFRMROCGGORS PWHADDR HYTJSRAP 
VALORE PAH YFGYVYFROVKDSS VXRGY FQXSLVLVS RLPFVRL 
FQALLS L 3 A PE Y FDKLAPCLBAVCSE1 DQW PAPAPGOTLNLPVM 
GWVOVR I PSRVDKSEE S PPKQFDQENLLPAP WLAS VHELDLF 
R CFR P VLTHMQT3.WELMLLGEPLLVLAPS PDVS SE WVLALTSCX 
OPLRFCCDFRPYFTIHDSEFKEFTTRTOAPPNVVl^VTNPFFIK 
TLOHWPHILRVGEPKMSGDbPKCVKLKKPFKV^RPWDTICP 


7060 


90 


1670 


SVWLP PSbWPWEEAMDSTKSEPLKGS FEAEDGN 1 EY KKLVWPSQ 
YRFEHLVTOMKWRLOEGRGEAVYOIGVEDNGLLVGLAEEEMRAS 
b KTLHRKAE XVGAD I T VLR ER EVDYDS D! • PRK1TEV bVRKV P DN 
0QFLD1>FVAVLGKVDSGJCSTLLGVLT0G£I^GRGRARLNLFRH 
bHEIOFGKTSSlSKLILGFNSKGEVHGlNCTOWGOTLRMGW*** 
RT* DGGRVWRLFEI V ♦ MNALRGL* TSSA? LRKSKGKQl,N* I KNG 
VKIKROGHPGNGLGPGNSEGVGRAGRRH*CPWALGOWNYSDSR 
TAEE1CESSSKMITFI DbAGHHKYbHTT3 FGbTSYCPDCAbLLV 
SANTGI AGTTREHUG1ALALKVPFF1WS K1 DLCAKTTVER7VR 
0 bE R Vb KQ PG CHKV F MLVTSEDD A VT AA0 0 F AOS PNVTP I F TL S 
SVSGESLDLLKVFLNILPPbTNSKEQEELMOCbTEFOVDEIYTV 
FEVGTWGGTLSR-* I DLbATLPTQPSPI YSKTSWPKGGDPG j 


7061 


164


710 


A-RMPSPLGPPCLPVI-lbPETTLEEPETARLKFRGFCYOEVAGPRE 
AL/^LRELCCOWLQF EAHSKEQMbEMLVI. EQFLGTbP PE I QAWV 
RGQR PGS PEEAAAbVEGLOHDP * ARMPS P LC P PCbPVMDPEITL 
EEPETARLRFRGFCYQEVAGPREALARLKELCCOWLOPEAHSKE 
OMLSMLVLEOFLGTLPFEIOAWVRGORPGSPEEAAALVEGbCHD 
PGQbLG 


7062 


71 


744 


AKAGTNLERLHWLSYFFCIPKHKLKSSOKDKVRQFMACTOAGER 
TAI YCLTONEWRLDEATDSFFOWPDSLKI- ESMUNAVDKKKbERL 
yGRYKDPODENKlGVDGIQOFCDDLSLDPASlSVLVlAWKFRAA 
TOCEFSRKEFLDGMTELGCDSMEKLKALLI'RLEOELKDTAKFKD 
FYOFTFTFAKNPGOKGLDL*MAGAYWKLVLSGRFKFLYLWNTFL 
MEHH 


7063 


2 


562 


LRTVPDLPGHRKRAMRTGORR*PEbPPDKNSLE0AEDLKAFERR 
LTEYIHCLOPATGRWRMl.I.IWSVCTATCAWNVJLIDPETQKVEF 
FTSLWNH PFFTI S C 1 TLI GLF FAG I HKRW APS 1 1 AAR CRTV1 A 
EYNMS CDDTG KLI LKPR PHVQ * OS S L I V^bK I AFXR 1 S DTAKS 
HKGFbLRLDK 


7064 


300 


684 


RDTGSDP-SSTRRbCS j CCTGH* PAEP1A* PHPSRG7 CPPASSAS 
SRRTGCWTCPPESGHAQARRSRRASASRKGARGAVRSAVAARGC 
$ SRAGRWbETPGRRKGP PACAAAAGRbRC FAP * AAP PTAS V PAR 
CRCPAARTGAPAAATWLRRRLSGLRAPA1C-RRRSPGPSPXSAAP 
PLLTPLGAGRAGGSRANS 


7065


1 


bSh 


ATTTHSAHRSGRGAAAEAAASAAGGRQKGFDRKAWEGRRTTPGG 
RS0SEPKAPPP0KRSEAAFASMAHSPVAVCVPGM0NN3ADPEEL 
FTKLER3GKGSFGEVFKGIDNRT00WAJ K5 JDbEEAEDEIEDI 
Q0EITVLS0CDSSYVTKYyGSYbKGSJCbW:3MEYbGGGSAl>DbL 
RAGPFDEFO 


7066 


356 


676 


FGPQRGPWRAREGGHPLDPADHFRAPASLkSNVRAATKMQlCDT 
YNOKHSLFNAMNR Fl GAVNNWDOTVMVPr L bRDVPLADPGLDND 
VGVEVGGSGGCLEERTPP 


7067 


152 


973 


KEN 1 TMATE I GSP PR FFKM PRFQHO APR0L F Y KJIPDFAQQOAMQ 
ObTFDGKRJiRKAVNRKTIDYNPSVI JCYbFN'RI WQRDtfRDMRAI 0 

T PEGRRLVTG AS SGEFTLWNGLTFN FET 1 1£AH DS PVRAMTWSH 
NDMWMLTADHGGYVKYW0SNMNNVKMF0A>:KEAI REARFIHNI P 
FS WPI VMVKLFSXC1 LGAEMHGLCQFbGNFLHPI NTI FFFVFT 
KSPFCWAPF 


7068 


222 


816 


DTM KE YVLLbF LALC S AKP FFS PSH I Ab KNWlb KDMEDU'DDDDD 
DDDDDDDDDDEDNSbFPTREPRSHFFPFDbFPMCPFGCQCYSRV 
WCS DLGLTS V PTNIP FDTRMLDLQNN KIKEJ K E NDFKGLTS b Y 
Gbl LNNNKLTKI HPKAFXTTKXLRRLYLSHNQbS EI PIA'LPKSL 
AELR3HENKVKKI0KPTPKKK 


7069 


1147 


1765 


FRDHRRYFYVNEOSGESQWEFPDGEEEEEESCAOENRDETliAXQ 
TbKDKTGTDSNSTESSETSTGSbCKESFSGOVSSSSbMPLTPFW 
TLLOS2^VPVbOPPLPbEMPPPPPPFPES?PPFPPPPPAPKMPPP 
1 EKTKKGRKDKAKKSKTKMPSLVKKWQS1QRELDEBDNSSSSEED 
KVSTAQKRIEEWKOQOLVSGMAERNANFEA


7070 




547 


PG7f4 EDSE AVQRATAL I EQRLAQEEENEKLRG DARQKL PMDLLV 
LEDEKHHGAOSAAbQKVKGOERVRKTSLDLRREHDVGGIQNLI 
ELK KXRKCKKRDALAASHEP PFE PEE 1TGF VDEETFLKAAVEGK 
M KV 1 E K FLADGG S ADTCPQFRRTALHRAS LEGUMElLEK LLDNG 
ATVDFQ 


7071 




921


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP 
VSNVAATSAGPSNVGTELNSVPCKSSPFLTRVPAYPPHSENIOY 
FODPRTOIPFEVPOYPOTGYYPPPPTVPAGVAPCVPRFVRSNNV 
PESSI.PPASMPYADKYSTFSPRDRMNSSPYOPPPPOPYGPVPPV 
FSGMYAPVYDSRRIWRPPMYORDDIIRSNSLPPMDVMHSSVYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
2 RRK PDQWAQY HTQKAPLVSSTLP VATQS P7 P PSTLNRGEGS 


7072 




921 


ARGTLRALETAXKVG KVG ANGQKAAG PS ADS VTENKI GS P PKTP 
VSNVAATSAGPSNVGTELNSVPOKSSPFLTRVPAYPPHSENIQY 
FODPRTOIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSmiV 
PESSI^PASMPYADHYSTFSPRDRMNSSPYCPPPPQPYGPVPPV 
P?GMYAPVYDSRRIWRPPMYQRDDI3RSNSLPPMDVWJSSVYQT 
S L.RER Y N S I JXS Y Y S V ACQP P S E PRTT VPLP RE P CGH L.KT S CEEQ 
iRRXPDO>3AOY}lT0KAPLVSSTLPVATOSPTPPSTLNRGEGS 


7073


SO 


S04 


LAHGSFGVSDFPAPAAAPAHTbTSFSGSLSPOFRKPLGRAPAMP 
LVRYRKW3LGYRCVGKTSLAHOFVEGEFSEGYDPTVEKTYSKI 
VTbGKPEFHLHLVDTAGQDEYS I LPYSFI 1 GVHGYVLVYSVTSL 
HSFQVI ESLYQKLHEGHGK 


7074 


261* 


1003 


VCPVLCSTROEPGMSSLVTYFGKPTRRKEFLLGHCIAAGKMNIS 
VDLETN YAELVLDVGRVTLGENS RKKMKDCKLR K KQNER VSRAM 
CALLNSGGGV J KAE I ENEDYS YTKDG I GLDLENS FSNILL FVPE 
V LDFMQNGN YFL I FVXSWSLNTSGLR I TTLS SKL YKRDI TS AKV 
MKWAALEFbKDMKKTRGRLYLRPELLAKRPRVDlQEENKMKAL 
AGVFFDRTELDRKEKL.TFTESTHVEI 


7075


59f> 


:oos 


NY1NFFFRKEYPPKV0KVEINPVRLSRLOGVERIMKKTEESESQ 
VEPEI KR KVQQKRhCSTYQPTPPLSPAS KKCLTKLEDLQRNCRQ 
AT TLKESTGPLLRTS I HQNSGGQKSONTGLTTKKFYGNNVEKVP 
1DII 


7076


279 


1049 


LOSES SN AAEGNEQRHEDEQRS KRGGWS KGRKRKK PLR DSNAPK 
SPbTGYVRFMNERREOLRAKRPEVPFPEITRMLGNEWSKLPPEE 
KORYLDEADRDKERYWKELEQYQKTEAYKVFSRKTQDROKGKSH 
R0SAARQATHDHEKETEVKERSVFDIPIF7EEFLNHSKAREAEI* 
RQLRKSNMEFEERllAALOKHVESMRTAVEiCLEVDVlQERSRNTV 
L0OHLETLROVLTSSFASMPLPEXGETPTVDTIDSYM 


7077 




1119 


S SMGSNSE I NGliALRKTDKYGFLGGSOYSGSbKSSl PVDVARQR 
ELXWIDMFSNWDKWLSRRFOKVKLRCRKGIPSSLRAKAWQYLSN 
S KELLEQNPRKFEELERAPGDPKWLDVI EXDL.HRQFPFHEMFAA 
RGGHGQODLYRI LKAYTI YRPDEGYCQAOAP VAAVLLMHMPAEQ 
AFWCXVQICDKYLPGYYSAGLEAI0L.DGB1FFALLRRASPLAHR 
HLRRQR I DPVLYMTKWFMCI FARTLPWAS VLR VWDMFFCEGVKI 
2 FR VAL VLLRHTLGSVEKLRSCOGM YETME0LRNLP0OCM0EDF 
LVHEVrNLPVTEALIEREKAAOLKKWRETRGELOYRPSRRLHGS 
RAIHEERRRQQPPLGPSSS 


7078 


4 83 


767 


FOGQRMAGEQKPSSNLLEOFILLAKGTSGSALTALISQVLEAPG 
VYVFGELLELANVOELAEGANAAYLQLLNLFAYGTYPDYIANKE 
SLPELY 


7079 


2 


376 


SWEFKRPKEPSGSDGESDGP3DVGOBG01>SOMARPLSTPSSSQ 
NpAR KXKRG 1 1 EKRRRDR INSSL* EkRRLVPTAFEKQGSS KLEK 
AEVLOMTVDKLKMbHATGGlGTHALLFOASFIQQl f


7080 


200 


595 


VQLFLEAPCLSLLSCRDHSGGNRDLSRRHRDCRVYGSPODGJPY 
LTHFLCHQDVVSVGRLOIRALATPGHTOGHLVYLLPGEPYKGPS 
CbFSGDLLFbSGCGEFPRKREEbGEEGETE VRAATVPWRALKP 


7081 


215 


506 


AVTEL'EMILNSLSLCYHNKLILAPMVRVGTbPMRLUU.DYGADl 
VYCEEL3 DLKMI QCKR WNEVbSTVDFVA PDDR WFRTCEREQN 
RWFQMGTS 


7082 


3 


1137 


APSRNTMLMAWCRGPVLLCLROGLGTNSFLHGLGOEPPEGAJ^SL 
CCR SS PRDLRDGEREH EAAQR KAPGAES C p SLPLS I S D I GTGCL 
SSLENbRbPTbREESS PREbEDS SGDQG R C GPTHOG S EDPSMbS 
QAQSATEVEERHVSPSCSTSRERPrOAGELILAETGEGETKFKK 
LFRLNNFGLLNSNWGAVPFGK1 VGKFPG01 bRSSFGKQYMLRRP 
ALEDY WbMXRGTAl TFPKD3 NM3 hSWD 1 NPGDTVbEAGSGSG 
GMS LFLSKAVGSOGRV I SFE VR KDHHDLA K KNYKHWRDS WK1>SH 
VE EWFDNVDF 1 HKDI SGATED1 KS LTFDAV ALDMLN PH VTLPVF 
Y PHbKHGGVC PVYWN I TQV 1 ELLD 


7083 




541 


RSKAVQLTRMEYAMKSLSLLYPKSLSRHVSVRTSWTOOLLSEP 
S PKAP RARPC R VS TADR SVRKG 1 MAY SLEDLLLKVRDTLMLADK 
PFFbV bEEDG TTVETEE Y FQA b AGDT VFMV bQKGQKW Q P PS EQG 
TRHPbSbSHK 


7084 


3 


52i 


NSVS VSS0SR FZASVPGTGVCRSAAADMAASTAAGKOR I PKVAK 
VKNXAPAEVQlTAEOLLREAKKREbEbbFPPPOOKlTDEEEbND 
Y KLR K R KTFEDN I RKNR TV 1 SN W 2 K YAQWEE SLKE 3 0RARS I YE 
RALDVDYRNI TLWLKYA£MEMK*JRQVNHARN I WDRA1 TTb 


7085 


243 


1499 


RQbARbRRRGWRSPFGGAPMAH 1 T 1 NQYbQQ-VYEAl DSRDGASC 
AEbVSFKHPHVANPRLQMASPEEKCQQVbEPPYDEMFAAHbRCT 
YAVGNHDF I EAYKC0TV 1 VQS F bRAFOAH KEENWAbP VM YAVAL 
DLRV FANNADOCbVKKG KSKVGDMbEKAAEbbMSCFR V CASDTR 
AGIEDSKKWGMbFbVNQbFKl YFKINKbHbCKPblRAl DSSNTL.K 
DDYS TAQR VT Y K Y YVGR KAM FDSDFKOAE E YbS FAFEH CHR S SQ 
KUXRM3 bl YbLPVK^bGHMPT\nELbKKYKbMQFA£VTRAVSEG 
MLLLLHEAbAKHEAFFI RCG 1 Fbl bEKbK 3 1 TYRNbFK K VYbbb 
KTKObS bDAFbVAbKFMQVEDVDl DEVQC3 1.ANLI YMGHVKGYI 
SHQHQXbWSKQNPFPPbSTGC 


7086 


25C 


525 


ILAARJ4GK0WS KLRPEVMODbL^STDFTEHElQEWYKG FbRDCP 
SGHbSMEEFKKI YGNFFPYGDASKFAEHVFRTFDANGDGTIDFR 
EF 


7087 


166 


723 


bSGSSAGKVAAPCVPPSNHEbVPJTTENAPKNWDKGEGASRGG 
NTRKS bEDNGS TR VTPS VQPHbQP 3 RNMSVS RTMEDSCELDbVY 
VTER 1 I AVSFPS TANEENFR SNLRE VAOMbKSKHGGNYbbFNLS 
ERRPDI TKbKAKVbEFGWPDbKTPAbEKI CS I CKAMDTWbNAHP 
HRCRVbHNKG 


7088 


1 04 


759 


rtc AA «; P^SIJ .FMAftEl TETGE1 .Y£<; YVGLVYMFNL.I VGT GAbT 
MPKAFATAGWLVSLVbbVFLGFMS FMTTTFV I EAMAAANAQLHW 
KRWENbKEEEDDDSSTASDSDVblRDNYERAEKRPIbSVQRRGS 
PN P FE I TDR VEMGQMASMFFN X VG VNLFY FCI I VYX. YGDLA I YA 
AAVPFSbMQVTCS ATGNDSCGV EADTKYNDTDRCWG PbRRVD 


7089 


33 


1775 


S VCWEDRYbKARMEESPbSRAPSRGG VNFbWVART Y 1 PNTKVEC 
HYTbPPG-mPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG 
SP I HTS VQFGASYbPKPGAQbY 0FRYVNRQGQVCGOSPPFQFRE 
?RPKDEbVTLEEAIX3GSDIbbVVPKATVl^N0I^ESO0ERNDLM 
QbKLOLEGOVTELRSRVQELERAbATARQEHTBLMEOyKGISRS 
HGE1TEERDILSR0OGDHVAR 1 bEbEDDl CT1 SEKVbTKEVEltD 
RLRJITVKAbTOEQEKLlCQbKEVQADKEQSEAELQVAOCETWHb 
NLDbKEAKSWOEEOSAOAQRbKDKVAOMKDTLGQAOORVAEbEP 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)






lk eqlrgaqe laassqqkatllge elas aaaardrt 1 aelhrsr 
levaevngklari«glhlkeekcov;skeragll»osvl'aekdkilk 
lsaeilrlekavoeertonovfktelarekdsslvolseskreC 
telrsalrvlqkekeqlqeekqelleymrklearlekvadekwn 
edattedeeaavglscfaaltdsedsspedmrlhpnafvsvetq 
aslllgle 


7090 


33 


1775 


SVCWEDRYLKARKEESPLSRAPSRGGVNFLNVARTVIPNTKVEC 
H YTLP PGTMPS AS DWI G I ? KVE AACVRD YHTFVWSS V PESTTDG 
SPIHrSVQFQASYLPKPGAQLYQFRYWRQGQVCGQSPPFQFRE 
PRPMDELVTLEEADGGSDILLWPKATVLQNQLDESOQERNDLM 
OLKLQLEGQVTELRSRVQFXERALATASQEHTELMEQYXGISRS 
HGEITEERD3LSRQQGDHVAR1 LELEDD1QTISEKVLTXEVELD 
RLR DTVKALTR EOEKLLGOLKEVOADKEOSEAELQVAgQENHHL 
KLDLKEAKSWOEEOSAOAORLKDKVAOMKDTUSOAWRVAELEP 
LKEQLRGAQELAAS SQOKATLLGE E IAS AAAARDRT 1 AELHRSR 
LEVAEVNGKliAELGLHLKEEKCOWSKERAGliLQSVEAFKDKILK 
LSAEILRLEKAVOEERTQNQVFKTEIJ^EKDSSLVOLSESKREL 
TELRSALRVL0KHKE0L0EEK0ELLEYMRKLEARLEKVADEKWK 
EDATTEDEEAAVGLSCP AALTDSE DE S P EDMRLHPKAFVS VETO 
ASLLLGLE 


7091


186 


1070 


EGMLTREHRCGRSEEOELEPWPSPKKARSGRWLRNGFKRKMEEP 
EEPADSGOS LVP vy 1 YS PE YVSMCDSLAK I P KRASMVHS LI EAY 
ALHKOMRIVKPKVASMEE^TFHTDAYLOHLOKVSOFGDDDHPD 
S3 E YGLGYDCPATEG1 FDY AAA! GGATI TAAQCL1 DGMCKVAIN 
WSGGWHHAKKDEASGFCYLNDAVLGILRLRRKFERILYVDLDLH 
HGDGVEDAFSFTSKVMTVSLHXFSPGFFPGTGDVSDVGLGKGRY 
YSVNVPIODG10DEKYYOICERYEPPAPNPGL 


7092


522 


HC9 


K0GINEDQEBSOKFRLGEGCEP3SKR0MKKLIK0KOWEEORELR 
KQKR KEKR KRK KLEROCQMEPNS DG HDR KR VR RD WH S TLRLI 1 
DCSFDXLM 


7093 


454 




NFGVSGVELAOUAbMVRKSFVIAACQLVLGLLMTSLT ESS IQNS 
ECPQLCVCEIRPWFTPOSTYREA 


7094 


2 


50t 


FVRSMHWGVGFASSRPCWDLSWN0S1SFFGWWAGSEEPFSFYG 
Dl 3 AFPLODVGG 1 MAGLGS DP WWX KTLY LTGGALLAAAA Y LLHE 
LLV I RKQQETDS KDA1 1 LKQFARPNNGVPSLS PFCLKMET YLRM 
AJDLP YON YFGGK LSAOGKM PW I E Y NHE KVSGTEF 1 1 


7095


1 


411 


I ASSLPXMASLLCSDR VLYLVQCSEK KVRAPLSQLY FCfc YCSELR 
S LE CVS HEVDSH Y C PS CLENKJPS AEAKLXKNRCAN CFDC PGCMH 
TLSTRATS ISTQLPDDPAKTTMK KAYYLACGFCRWTSRDVGMAD 
KSVGE 


7096 


224 


2C67 


ETR S LAVQE K PS CAGR R R S S R I S F AG ALFLTR FLLQ E L»LLNN FC 
5 AM S P APDAAP A PAS 1SLFDLS ADAPVFQGLS LVSHAPGE ALAR 
APRTSCSGSGERESPERKLLQGPWDISEKLFCSTCDQTFQNHQE 

EDSDSASEEDLOTLDRERATFEKLSRPPGFYPHRVLFCNAQGCF 
LYAYRC\nLGPHOD?PEFJ^j:LLLC?NLOSXGPRDCVVLKAAAGHFA 
GAI FOG R E WTH KT FHR Y T VRAKRG TAQG LRD ARG GPS HS AG AN 
LRR YNEATLYKD VR DLLAG PS WAKALE EAGTI LLRA PRSGRSLF 
FGG KGAPLQRGD PR L WD J PLATRR PTFQELQR VLH KLTTLHVYE 
EDPKEAVRLHSP0THWK7VRE2RKKPTEEEI RK3 CRDEKEALGO 
WEESPKOGSGSEGEDGFQVELSLVELTVGTLOLCESEVLPKRRR 
RKRNKKEKSRD0EAGAHRTLLQ0T0EEEPST0SS0AVAAPLGPL 
LDEAKAPGQPELWKALLAACRAGDVG VLKLQLAPS P.ADPRVLSL 
LSAPLGSGGFTLLHAAAAAGRGSVVRLLLEAGADPTVOCQDH 


7097 


256 


1228 


I RTKSAATWEA.WPOCGREGSRI ITEPCEANAGSRQELGTBRISS 
FLAAQGDQAFHSGLETNKSNSELPLRVGLKVAQGS PLMGGOVSA 








SNSFSRLHCRNA.NEDWMSALCPRLWDVPLKHLSIPGSHDTMTYC
LNKKSPISHEr5RLbQLLNKALPClTRPWLKWSVT0ALDVTEQ
LDAGVRYLDLR : AHML EG SEKNLH F VHMVY TTAL VE DTL TE 3 S E
WLERHPREW J LAC RN F EG LS E DLHEY LV AC I KN I FGDMLCPRG
EV PTLRQLWSK G 3QV I VS YEDES S LRRHKELWPG VP YV WGNRVK
TEALIRYLETMKSCGR


7098


82 


95fc 


SS FLKRCRKV h< : CWG1 PSEQSLFSTLiEEPRDKEI DNYC VMRljQT 
EAR SG FWAPNR FPVN1CR KTAVDGDRGGSS RETCRCH FH PS LEA 
LVLLLQDWQPGGVGICTSFLGISWALLDYHRALRTCLPSKPLLG 
LGSSV J Y FLWK 2 .1 «LLVJ P RVLAVALFSALF P SYVALH F LGLWLVL 
LLWVW LQGTDFK FOPS S E WLYR VTVATI LYFS WFNVAEGRTRGR 
AIIHFAFLLSD5 I LLVATWVTHS S WLPSG3 PLQLWLFVGCGCFF 1 
LGLALRLVYYHWLHPSCCWKPDPDOVD 


7099

~7i6~ 




210 


LFRLAPGFLRSLARQGYHQIWAFPFLPSGATATWPAASRSRSLA 
ARS LPR SPARPG FNDALLGEHDFRGQGVRAQR FRFSEEPG PGAD 
GAV1,EVHVPQI GAGVSLPGI LAAKCGAEV1 LSDSSELPHCLEVC 
ROSCQMNNLPHLOWGLTWGHISWDLLALPPODIILASDVFFEP 
EDFEDI LATI Y F^MHKNPKVQLWSTYQVRSADWSLEALLYKWDM 
XCVH1 PLESFDAI^KEDl AESTLPGRHTVEMLV1 SFAKDSL 


205 


671 


ANGGFWEAAPGS I-VSLPLWVPTASHSKTTALGIGSAPPPHLSVL 1 
FLFSFPPQLGPP LEAFP VFK KYDRNGLKVS 3 ECKRVSGLEPATV 
DWAFDLTKTNMCTMYE0SEV3GWKDREKREEMTDDRAWYL1 AW EN 
SSVPVAFSHFRFDVERGDEVLYW 


7101 


2 




WRGG PR RAKRLA GG AVGWVLLVRG VHS VRAGGGR P P RAADMKKD 
VRILLVGEPRVGKTSL3MSLVSEEFPEEVPPRAEEIT1 PADVTP 
ERVPTHIVDYSKA^QSDEOLHOEISQANVICIVYAVNNKHSIDK 
VTSRWIPLINERTDKDSP.LPLILOGNXSDLVEYSR 


7102 


2 


502 


WRGGPRRAKRL>.GGAVGV?VLLVRGVHSVRAGGGRPPRAADMKKD 
VRILLVGEPRVGKTSL2MSLVSEEFPEEVPPRAEE1TJ PADVTP 
ERVPTHI VDYSE AEQSDEOLHQEI SQANVI CI VYAVNNKWS I DK 
VTS R W 1 PL 1NER 1 DKDSRLPL1 LGGNKSDLVE Y S R 


7103 


IIS 


438 


GSOSSVAVNIRSGTDEESMDLMNGQASSVNIAAtASEKSSSSES 
l^DKGSELKKSF^AWFDVLKVTPEEYAGQITLMDVPVFKAIQP 
DEL£SCGWNKKE>; YSSAF 


7104 


167C 


79b 


rlwehrsvsaga5 gwglss pgclllhpslpeeervdi l j nnagv 
mrcphwttedgff;wfgvnhlgeakagaapwvqailprrpp*kvl 

GF* V* VKSDLF3 3 LNPGHFLLTNLLLDKLKASAPSRIINLSSLA 
KVAGHIDFDDLNWQTRKYN7KJXAYC0S\KLA1VLFTKELSRRLQ 
GSGVTVNALHPG VARTELGRHTGI KGSTFLOHHN\ WAJILLAAWS 
KS PRSWPAPAQHrJTLAVAE ELA\VI SGKYFDGLKOKAPAPEAED 
EEVARRLWAESARLVGLEAPSVREOPLPR 


7105 


765 


143 


G0MCRR PS PKST5- CLSMTCDLP /RGL0DP0CLALFR VAVDKHOA 

S01 P VQQMHLFD VHN Y PD YV S SGGGFGP ADDHG YG VS Y I F MG DG 
MITFH I SSKKSST KTDSHRLGQHI EDALLDVASLFQAGQH FXRR 
FRGSGKENSRHRCCFLSRQTGASKASMTSTDF 


7106 


14 


1064 


GLOAGHPHPRSAS Rl PEADTH\YSKLQRAFDS1 VUKDHKRMFGT 
YFRVGFFGSKFGDZ.DEQEFVrKEPAITKLFEISHRLEAFYGOCF 
GAEFVEVIKDSTPVDKTKLDPNXAYIQITFVEPYFDEYEMKDRV 
TY FEKN FNLRRFKY TTPFTLEGRPRGELHEOYRRNTVLTTMHAF 
PYrKTRISVrOKFEFVLTPIEVAlEDMKKKTLOLAVAINOFPPD 
AKMLOWVLQGSVGATVNG^PLEVAOVFljAElPADPKLYRKKNICL 
RLCFKEF1 MRCGE^VEKNKRLl TAD0REY00ELXKNYKXLKENL 
R?M 1 ERK1PELY K P I FRVESQKRDS FHRSS FRKCBTQLSOGS 


7107 


114S 


591 


* I * WLQTGKXK 






WO 01/53312



PCT/US00/34263



7108


1 




V K V ALLLTN LEO PR T ES E W F. NS PT L» KMFLFQF VNLNS S 7 F VIAF 
1LGRFTC-HPGAYLKUNRVJRLEECHPSGCL1DLCMQMG J 1 KVLK 

OTt-"MMFMn rVT)T T PNUUTPPWVPnFHflOPG K CTTDr^lVP V nVKTf 
- t»r«INrn£.iA>I ri-».i U'"™ : KKI\vnije.riVjt > T»r»f\^ j>r r'yiM',.\l..yi\JJL > 

CPMNAYGLFDEYLEMILOFGFTTlFVAAFPLAPblALLN^IIEI 
RLDAYX FVTQWRR FLASRA K 01 GI WYGILEG 1 G J LS V I Tf : « FV I 
A3 TSDFI FKLVYAyKYGPC/^GC>GEAGOXCMVGYVNASLSVFR IS 
DKENRSEPESDGiTEFSGTPLKYCRYRDYRDFPHSLVPYGYTLOF 
Wr. \ LAW 


7109 

j 


964 


102 


Wt-ORKR^SbVPGPWlGPAOEEPWKKKESlJGJ^OEAl^lCW^ 
TOPFPKSEOVYLHFLSWTEDGPEPKDKGSLPQPPITEVLfiQVF 
PEKLATDTSTFEATSEGTLELO0RNPKAERLRWSPAQEESFRQM 
W J HKEI PTGKKDKECSECG KTFI YNSHLVVHCR VHSGEK P YKC 
SDCGKTFKCSSNLGCHQR I HTGEK PKECNECGXAFRWGAH LVOH 
GR J HSGEK PYECNECGKA FSQSS YI.SQHRR1 HSGEKPFJ CK5CG 
KA.YGWCSEL.I RHRR VHARK EPSH 


7110 


96 


697 


RLDNFSGFLVEVTKEERHIVKPbYURYRLVKOMLTRASITFVLG 
SY STXRRGQMLQPJ 1EGE7AHFFEEI XEEEEDGVNLSSEi.GDML 
KTAVQVC? SLKN £ £ SDVE E NQE KLA1 .DLRLSS S RAASMP F. 1 /LHQ 
LW KARAEK KKLR KTLR EFEFAFYOONGRNAQKEDR VP VLF E YSE 
YKKIKAKLRLLEVLISKQDSSKSl 


7111 


2 


414 


Gi ; GLYRGFTPGG0Cl WKPKSMPPDHERNFGFTCFALELNF] ,TAE 
LKRSLPSTDTRLR PD0R Y L E EGN 1 OAAEA0KRR I EObOR I;R R KV 
MLENN I VKQARFFKRQTDS iJGKEWW VTNNTYWR LRAEPG Y GWMD 
GAVLW 


7112 


103 


49S 


PRCFPVADRGRLIGGLPDVX'TIMEGKTLMLTCTVFGNPDPEVIW 
F KNDQD J OLS EH F F VKVE OAX Y VS MT I XGVTS EDSGKYS 3 ?v J XN 
KYGGEKIDVTVSVYKHGEX1 PDMAPPQQAKPKliIPASASAAGC. 


7113


1 


024 


" K CLRQAWHEAPS 5 LAFTR WCSREERAEGGGNLHRS I TRDP K P PG 
LF.FSQR?MDDKKKKRSPKPC^0PAOAPGTLKRVPVPTSHSGSI> 
ALGLPHr,PSFKORAKFKRVGKEKCRPVL.AGGGSGSAGTPLOHSF 
I,TEVTDVYENEGGLLNLLND?HSGRLQArGKECSFEQLE)A ? REM 
OEKIjAJiLHFSLDVCGEEEDDEEEEDGVI'SGLPEEOKKTKADRWL 
DC/L.LS NLG SCLGALVPGGM R6GEGT Y SQS HS WA LG EtCVG VHG S K 

S h>C- P L»N L>F R R 


7114 


3 


1492 


WiEVDEQI DHYKESQDKFLVJCAAF1GKETLKDRSGQECK"? CK Kl 
I YLNTDFVSVKORLPKYYSKERCSKHHLNFLGONRSYVRKKr'DG 
CKAYWKVCLHYNLKK-^OPAERr FDFNQRGKALH0KQALRKSORS 
QTGEKLYKCTECC'KVFlQKAi^hVVHQRTHTGBKPYECCECAKAF 
SOKSTLI AHQRTHTGEKPYECSECGKTFI0KSTL1 KHOR'J HTGE 
KPFVCDXCPKAFKFSYHLIRHEKTHIRQAFYKGIKCTTSSl.IYQ 
RlHTSEKPOCSEHGKASDEKPSPTKHWRTHTKENIYECSKrGKS 
FC^V^WT.QVWriBTUTOFKPYFC'^T CnKTFSGK c HLSVHPR7PlTG 

4 T\\^ i-> O V rl\/I\ .ill i CEftr I L/V V A 'wOM r »)N3I\v»"V » Hiif^ 

EKPYECRRCGKAFGEKSTL.IVHQRMHTGEKPYKCNECGKAFSEK 
SPL3XHQRIHTGERPYECTDCKKAFSRXSTLIKH0R1HTGEKPY 
KCSECGKAFSVK5TLIVHHRTHTGEKPYECRDCGKAFSGKSTLI 
KHQRSHTGDXNL 


7115

1 


1 


947 


JJAAHGYNWGLWCMY 1 1 PPQDWLDRGDESAP I RTPAMI GCS F WD 
REYFGD1GLLDPGMEVYGGENVKLGMRVWQCGGSMBVLPCSRVA 
HI ERTRKPYfWDlDYYAKR^ALRAAEVWMDDFKSHVYMAWKl PM 
SN PGVDFGDVS ERLiALRQRIjK CRS FKVTY1>ENVY PEMRVY IWTLT 
YGEVRHS KASAYCLDQGAEDGDRA3 V* PCHGMSSQLVRY S ADGL 
U>LGPLG5TAFLPDSKCLVDDGTGRMPTLKXCEDVARPTQR1,WD 
TTQSGPI VS RATGR CLEVEMS KDANFGLRLWORCSGQXtWl RN 
WIKHARK 


7116


866 


95 


RVRMRRNAEVIEEXLSMKSWAXFRPGEPWKGYPNlDPEfDx-VVT 
PGS VimaLS I NTVR EVDHLRDRNSGSS S SLNTTLPS TSAWS S I R 



SEQ
ID
NO:



Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence



7117 



7118 



7119 



49 



Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence



1261 



186 3 



1863 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
ASNYNVFLSS TAQSTS ARNSbS KbTWS PGSVTN TFbABELWKV V~~ 
L P P KN 1 TA P S R P P PG LTGy K F P LS TVS DNS PLR 1 GC G W GNSDAK V 
TPGSSWGESSSGRITNWLVLKNLTP0IDGSTLRTLCW0HGP1JT 
FHL.N bPHGNAbVRYSSXEEVVKAQKSI.>HlSDLFbb71 



Lb3 STPGGCKPPPSS I EFTYTGAWGKALPAPHMPGA FGAbFCXi 7T 
FVSQAARA2 7 bbQPSQAAQAF.GbSQPARACGALCSL ? WPLRNV-'G 

? pi lrlpgglktptndrktrtksamacwaraqwdtlgplkls;:r 

CXVCbRHPRPTGVRGGPGAAGRCXJGMGTRRRGTFTSG/vRDPGCb 
RVKHRCOPTGHLP 



Tl!CEPNPGA(?AMVLLHVLr EFiAVGYAUALKEVKk 3 ±LLQPQ\'Z 
ES VLNbGKF H S 1 VR LVA FCP FAS S Q VALENAN AV EG WHEDbR 
M,LETHLPSKKKKVLLGVGDPKIGAAIOFELGYNCCTGGVIAi: 
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNKVD 
NM1 1QS1SLLD0LDKD1NTFSMRVREWYGYHFPKLVKI INDNA7 
Y CRLAQK1 GNRR EbNEDKbEKbEELTMDGAKAKA 1 L V ASRSSMC- 
MD1 SAl DL1K 2 ESFSSRWSbSEYRQSLHTYbRS KKS OVAPSbS 
AL 1 GEAVGARbl AHAGSbTNbAKY PASTVQ1 LGAE KAbFRAbKT 
RUNT PK Y GL i FHSTF J GRAAAKNXGR I SRYliANK Ci 1 ASR I DC F 
SEVPTSVFGFKbKEOVEERJ.SFYETGElPRKNbDVKKEAMVOAt 
F^J^EITRKIXKQEKKRbKKEKKRbAALAbASSENr-.'.^TPEECE 
EMSEKPKKKKKOKPOEVPQENGKEDPSlSFSKPKKKKSFSKELb 
MSSDbEETAGSTSlPKRXKSTPKEETVNDPEEAGV.P.SGSKKKKK 

P S KEEP VSSG P EEAAGKSSSKKKKKFHKASOED 

PHCE PN PGAGAHV bLHVbFEHAVGY AbbAbKEV EE) ^ bbQPQV :• . 
T.S VLNbGXFKS J VRbVAFCPFASSQVAbENANAVSf CWHEDbJx 
U.»bETHbPSXKKX\TLbGVGDFKlGAAIQEEbGYNC07GCVIAE2 
LKGVRbHFHNl J VKGbTDLSACKAObG^GHSYSRAKVKF^A7NPV:. 
NK3 20S1SL1.D0LDKD1NTF£WRVREVIYGYKFPEI,V K3 INDNA/J 
YCRl^OFlGNRRELNEDKbEKbEEbTMDGAKAKAIbD/^SRSSyiG 
KDISAlDLlKlESFSSRWSbSEYRQSLHTYLRS^SOVAPSLi; 
Ab 1 GE AVGAK L I AHAGSLTNLiAKY PA^ TVO I bG Ac KALFRAbK ? 
RGNTPKYGblFHSTFlGRAAAKNKGRISRYbANKCR J ASRIDCf 
SEVPTSVFGEKbREQVEERbSFYETGEIPRKNbDVKKtAMVOAi: 
EAA/AEITRKbFKQEKKilbKKEKJCRLAAiJUA£SENE£STPEECE 
EMSEKPKKXKKOXPQEVPQENGMEDPSISFSXPKKKKSPSKEEL 
MS S DbEETAGSTS I PKRKKSTP KEETVNDPEEAGH kf-.GS KKKK K 
FSKEEPVSSGFEEAAGKSSSKKKKKFHKASOED 



7120 



XS91 



64 



QbGTRRCbRGDKVTNAl^DFbVTNLEPRF I EPQTAK i .S WFKDi 
NS TTPb I FVbS PGTDPAADb Y K FAEEMKFSKKLSA. I S 1/3QGQGI 
RAEAMMRSSl ERGKWVFFQNCHbAPSWMPALERb J FK3NPDKVH 
R DFRL WLTS LPS NKFP VS I LONGS KMT I EP PRG VRAA T I>LKS YS 5 
LGECFLNSCKKX'MEFKSbbLSbCbFHGNALERRKFGPbGFNIPY 
E FTDGDbRI C ? SQLKMFLDE YDDI PYXVLKYTAGE 1 N VGGRVTr 
D>JDRRCIMNIbEDFYNPDVLSPEHSYSASGIYHOIPFrYDLHGY 
LS Y 3 K S b P LN DK PE I FGbH DN AN I TFAQNETFALbGT 3 1 QbQ P K 
SSSAGSQGREE1 VEDVTQNI bbKVPEP J NLQWVMAK YPVLYEES 
MNTVbVOBVJRYNRbbQVITQTbODLbKAbKGbVW^SSObEbMA. 
AFbYNNTVPEbWSAKAYPSLKPLSSVfVMDLLQRbDFLCAWIQDG 
I PAVFWlSGFFFPOAFbTGTbONFARKFVISIDT* SFDFKVMFE 
APSELTQRPQVGCYIHGbFLEGARWDPEAFQLAESOFKEbYTEM 
A V I WbbPTPNR KAQDQDFYLC P I YKTLTRAGTLS TTGHSTNYV 3 
AVE I PTHQPQRHWI KRGVAL1 CALDY 



7121 



546 



RPbRPWVbSbGSMVGbMTYGRRQFOSbDTTMRRblPPPREASAK 
LTTbVDADAEAFTAYbEAMRbPKNTPEEKDRRTAAbOEGLRRAV 
S VP bTbAETVAS bWp ALCELAR OGNLACRSDbQVAA KAbSMG V F 
GAY FNVLINbRDI TDEAFKDQ I HKRVSSLLOEAKTO AAbVLDCL 
ETRQE 


7122 


/. 


546


R PLK PW VLS1>G SMVGLMTY GK RQ FQ S L.DTTMR RL 1 PPFJTeASAK 
LTTLVDADAEArTAYLEAMRLFKNTPEEKDRRTAALOEGLRRAV 
SVPLTU\ETVASLWPA1>0ELARCGNLACR£DLQVAAKALEMGVF 
GAVFNVLINLRDITDEAFKDOIHHRVSSLLOEAKTQAALVLDCL 
ETRQE 


7123


3 


1092 


KPAVPEARSAGTSEAGRSGAEFVSCGSVSGDGAAMRLTPRALCS 
AA0AAWRENFPLCGRDYARWFFGHMAXGLKKM0SSLKLVDC1IE 
VKDARIPLSGKNPLFQETLGLKPHLLVLNKMDLADLTEOQKIMO 
HLEGEGLKNV] KTNCVKDENVKCT 1 PMVTELIGRSKRYHRKENL 
EYC3MVIGVPKVGKSSL1NSLKR0HLRKGKATRVGGEPG1TRAV 
MSKIQVSERPLMFLLDTPGVl*APRIESVETGLKI»ALCGTVLDHL 
VGEETMADYLLYTLNWJORFGYVOHYGtGSACDNVERVLKSVAV 
KLG KTQKV KVLTGTGNVNV IQ FN Y P AAARD F LQT F RKGLLGS VM 
LDLDVLRGKPRV 


7124 


2 


382 


LPLTLLLAAPFAHLLLPPGHD^SPCWUPGPALSPGTLGPLSWAM 
ANSGLQLIjGYFUALGGWVGI IA.STALPQWKOSSYAGDASIQLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVKSEKGFWS 


7125 


I6t 


3127 


NClSEKRNYSFSMOKGKGRTSRjRRRKLCGSSESRGVNESHKSE 
FIELRKWLKARKFODSNLAPACFPGTGRGLMSQTSLQEGQMIIS 
LPESCL»LT\RDTVIRSYLGAY] 'i KWKPPPSFLLAL.CTFLVSEKH 
AGHRSLLEA\Y1jKI LPKAYTCPVCLEPF.WNLLPKSLKAKAE50 
RAHVQEFFASSKDFFSSLQPLFAEAVDSIFSYSAU.WAWCTVNT 

RAVYL\ S PG SGN AFLOS RTPVCbA PY LDLLNHS PHVOVXAAPNE 
ETHSYEIRTTSPWRKHEEVFICYGPHDNQRLFLEYGFVSVHNPH 
ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFI VPSPARRCSQXGSU>HLPTtfPWLWAAMS PRGQERGT 
SHSOAREPOR PGR WLLGSLQS S PGTLGQAGTASR K RGCM VQRWV 
0VATGRRAV0VPKGALGIJOGETSPGASRGMSGGAGGCWALGWA 
PSPVLPSVTLLEGPPPWLSIISDSGTgRPSPRRCPARPSPWGPQC 
WRGGRIASAEASST*TPGSGSRARSGRRSPGSRRRSASAPSPTP 
PTDACA* SCVARPAGSRSSRPAAA 


7127 


1312 


277 


GLPAMCST* KAG Y YEETEGDCI P KDR* 1 EKRP FKE 1 *RRIPRIF 
AKQKQ1 * S* N£C*K I GASE 1 DRGR KEADCSDAP AAAR 1 GAVSVFR 
RSTOEARVSPRSNAKSAKl.RAVRAD^WEIJFVL.LFHTPEQFLAEC 
I CR5T* * X* WHOLC* PLSSL* TGI..KRKLLL* VLFRI ♦ WLKDCDV 
*FCOKJFATNFCKWQNL10♦EE'KPVEYSVEN*H3^5NLLLPM*L 
COSSLRDQTI VTVIRM* RNYSMFR i NMISSL* DGS IH I PLKLHFY 
PALI FTLTVPI NS CCQRPLPLFAHQS 1 KTLASSGSPMLACLRFL 
LVKKRAFIHTPRSPGCSV*CKHVI>VKDNKN^CVGSEV 


7128 


2 


5228 


GRVDLWTILLGRSALRELSQIEAELNKMWRRLLEGLSYYKPPSP 
SSAEKV KAN KDVASPLKBLGLR 2 SKF1^LDEE£SVQI..LQCYLQE 
DYRGTRDSVKTVL0DERQSQAL3 LK1ADYYYEERTCI LRCVLHL 
LTYFQDERHPYR VE YADCVDKLE KELVS KYRQQFEEL YKTBAPT 
WETHGNLMTEROVSRWFVQCLRECSMLLEI I FLYYAYFEMAPSD 
LLVLTKMFKEOGFGSRQTNRHLVDETHDPFVDR IGY FSALI LVE 
GMDI ES tHKCAI^DDRRELKQFAODGLI CQDMDCLMLTFGDI PHH 
APVbIAWALLR^TLNPEET5SV\'RKIGGTAIOLNVF0YLTRLlX) 
S LASGGNDCTTS TACMC VYG LL S FV1»TSI>EL>HTLGNQQDI IDTA 
CEVLADPSLPEL FWGTEPTSGLG 1 1 LDS VCGK FPHLLS PLI/QLL 
RALVSGKSTAKK V YSFLDKMS F YNELYKHKPHDVI SHEDGTLWR 
RQTPKLLYPLGGCTNLRIPQGTVGQVMLDDRAYLVRWEYSYSSW 
TLFTCEI EMLLHWSTADVIQHCORVKPI IDLVHKVI STDLS J A 
DCLLP3 TSR I YMLLORLTTVI S P PVDVIASCWCI/TVLAARNPA 
KVWTDLRHTG FL P FVAH P VSS LS QMI S AEGMN AGG YGNLLMNS E 
0PC«5EYGVTlAFLRLlTTLVKG0i^STQS0GLVPCVMFVLKEMIj 
PSYHKWRYNSHGVREQ1GCLI LELI HAILNLCHETDLHSSHTPS 
LOFLC3 CSLAYTEAGOTVINIMGlGVDTIDf^/yiAAQPKSDGAEG 
QGQGQhhl KTVKLAFS VTNNV IRLKF PSNW £ PLEQALSQHGAH 
GNNL I AVLAXY I YH KHDPALPRLA3 QLLXRLATVAPMS VYACLG 
NDAAAI RDAFLTRLQS K\ I E\DMR I K\ VMI L \ EFLT VA\ VETOP 
GLIELFLNLEVKDG\SDGSK£FSLGMW\SCL>AV/VWFLIDSOO 
0DRYWCPPLLHRAA1AFLHALW0DRRDSAMLVLRTKPKFWENLT 
SPLFGTLSPPSETSEPSlLETCALir.KIICLEIYYWKGSLDOP 
LKDTLKKFSIEKRFAYWSGYVKSLAVKVAETLGSSCTSLLEYQM 
LVSAWRMLLIIATTllADIMHLTDSWRRQLFbDVLDGTKALU.V 
PASVNCLRLGSMKCTIALILLRQWKRELGSVDEILGPLTEILEG 
VLQADQQLMEXTXAKVFSAFI TVl.QMKEMKVS DI PQYSQLVLNV 
C ETLQE EV 1 ALFDQTRHS LALGS ATE D KDS Mi. TDDCS R S RH R DQ 
RDGVCVLGLHLAKELCEVDEDGDSWbQVTRRLPlLPTLLTTLEV 
SLRMKQNLHFTEATLHLLLTLARTQQGATAVAGAGITQS I CLPL 
LSVYOLSTNGTAQTPSASRKSLDAPSWPGVYFLSMSLMEQLLKT 
LRYNFLPEALDFVGVHOERTLOCLNAVRTVOS IJVCLEEADHTVG 
F I LQLS N FMKEWH FHLPQLMRD1 QVN LG YLCQACTS FLHS R KML 
QHYLONKNGDGLPSAV\AQRV\0RPPSAASAA.PSSSK0PA7uOTE 
A5KOOAIJiTVQYGLLKILSKTJLAAl,RHFTP0VCOILLDOSLDLA 
E YNFLFALS FTTPTFDS EVA PS FGTLLAT VNV ALNMLGE LDKKK 
EPLTQAVGLSTQAEGTRTLXSLLMFTMENCFYLLISQAMRYLRD 
PA VI 1 PR DKQRMKQE LS S ELS TLLS S LS R Y FR K G A PS S P ATG VliP 
SPOGKSTSLSKASPESQEPLIQLV0AFVRHMOR 


7129


1 


1054 


FK R FR WR RRLH * AG PAS S AG GS PG EAS GTMSG H L P PN 1 N I K EPR 
WDQSTF1GRANHFFTVTDPRNI LLTNEQLESAR KI VHDYRQG IV 
PPGLTENELWRAKY 1 YDS AFH PDTGEKMI LI G K MSAQVPMNMT I 
TGCMMTFYRTTPAVLFWQWI NQS FNAWNYTK K SGDAPLTVNEL 
GTA YVS ATTG AVATALGLNAL TKHVS P L I GR F V P FAA V AAAN C 1 
N I PLMRQR ELKVG I PVTDENGNRLGE S ANAAK Q A I TQVWSR I L 
MAAPGMAI PPFIMNTLEKKAFLKRFPWMSAP 1 OVGLVGFCLVFA 
TPLCCALFPQKSSMSVTSLEAELOAKIQESHPELRRVYFNKGL 


7130 


2 


760 


HEVPSL/QTSDPLPGSVORCSVWSOPNKENWCCDHLYNSLGRKG 
ISAKSOPYHRSQSSSSVLTNKSMDS1NYPSDVGKQQLLSLHRSS 
RCESHODLLPDIADSHQOGTEKLSDLTLQDSOXVWVNRNLPLN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DS K FVDADFSDNVCSGNTLHS LNS PR TP KKP W S KLGLS PYLTP 
YNDSDKLNDYLWRGPSPNQQNIVOSLREKFQCLSSSSFA 


7131 


805 


573 


AAAEGHI EWKFLI EACXVNP FAKDR WGNI PLDDAVQFNKLEW 
KLLQDYQDSYTLSETQAEAAAEALSKENLESMV 


7132 


1420 


1067 


I DMLLLSG ALVSGP YTL I TTAVS ADLGTH KS LK GN AHALS. TVTA 
1 1 DGTGS VGAALGPLLAGLLS PSGWSNVFYWLK FADACALLFL I 
RLIHKELSCPGSATGDQVPFKEQ 


7133 


2 


3648 


00 I PGLLPAHGESGDALRKPRLQKPI TG HLDDLFFTL YPSLEXF 
EEELLELHVQDHFQEGCGPLDGGALE I LERRLR VGVHNGLGFVQ 

RLPEMVGHPAFAVI FQLE YVFSS PAG VDGNAAS VTSLSNLACMH 
MVR WAVWNPLLEADSGR VTLPLQGG I QPNPSKCLVY KVPS ASMS 
SEEVKOVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVLAAPONSPVGPGLS I SQLAASPRSPTQHCL 
ARPTSQLPHGSQASPAQAQEFPLEAG I SHLEADLSQTSLVLETS 
I AEQLOELPFTPLHAP I WGTQTRS SAG0PSRAS?4VLL0S SG FP 
E I LDAN KQPAEAVS ATEPVTFN PQXEE S DCLQS NEMVLQ FLAPS 
RVA0BCRGTSWPKTVYFTFQFYRFPPATTPRLCLVOlJ>EAG0PS 
SGALTHILVPVSRDGTFDAGSPGFOLRYMVGPC-FLKPGERRCFA 
R YLAVQTLQI DVWDGDS LLLI GSAAVQM KHLLT< QGRPAVQASHE 


7134 






LEWATEYEODNMVVSGDMLGFGRVKPIGVHSWKGRLHLT'LAK 
VGH PCEQKVRGCS TLPPSKSRV1 SNDGASR FSGGSLLTTGSS RR 
KHWQAOKLAD V PS E hAAMLLTHAROCKGPQDVSRESDATRJi. R X 
LERMRSVRLOEAGGDLGRRGTSVIAOOSVRTOHLRDLQVIAAYR 
ERTKAESIASLbSlAlTTEHTLHATLGVAEFFEFVLKNPHNlXJH 
TVTVEIDNPELPV1VDSQEWRDFKGAAGLKTPVEEDMFHLRGSL 
APQbY LR PHETAHV P FK FQS FSAGQLAMVQAS PGLSNEXGMPAV 
SPWKSSAVPTKJ^.KVliFRA.SGGKPlAVLCLTVELQPHWDOVFR 
FYHPELSFLKKA) RLPPWHTFPGAPVGMLGEDPPVHVRCSDpNV 
ICETQNVGPGEF RD1FLXVASGPSPEI KDFFVI I YSDRWLATPT 
0TW0 V YLH S LOK VDVSCVAGOLTRLSLVLRGTQTVRKVRAFTSH 
PQELKTDPKG VF VLP PRG VQ3LHVG VR PLRAG S RFVMLNLVDVD 
CHQLVAS WLVCLC CRQPL 3 SKAFE I MLAAGEG KGVNKR ITYTNP 
YPSRRTFHLHEDHFELLRFREDSFQVGGGETYTIGLQFAPSORV 
GEEEIL1Y3NDKEDKNEEAFCVKVIY0 


2115 


ill j 


GGEG F S Y P FHVGIjS LGTPLDPHY Vl»liE VHY DN PTY EEGL1 DNS G 
LRbFY TMD3 RKYPAGVJ EAGLWVSLFHTI PPGMPEFQSEGKCTL 
ECLEEALEAEKP^Gl HVFAVLUHAHLAGRG3 R LRHFRKGKEMKL 
LAYDDDFDFNFCFFQYLKEE0T1LPGDNL1TECRYNTKDRAEMT 
WGGLSTRSEMCLf YLLYYPRINLTRCAS1 PDIMEQLQFIGVKE3 
YRPVTTWPFI 1 K S P KQ YKN hS FMDAMNKFKWT XXEGLS FN K LVb 
SLPVNVRCSKTDrC/vEMSlOGMTALPPDIERPYKAEPLVCGTSSS 
SSLHRDFSINLIA'CLLLLSCTLSTKSL 


7135 


2 


2072 


FVPRVTPRSLSLOGPKGESVGS1TQPLPSSYL1FRAASESDGRC 
WLEALEIALRCSSLLRLGTCKPGRDGEPGTSPPASPSSLCGLPA 
SATV}JPD0DLFP1^GSSLFNDAFSDKSERENPEESDTET0DHSR 
KTESGSDOSETPGAPVRRG'iTYVEOVOEELCELGEASQVEr^SE 
ENKSLMWTLLKObSPGMDLSRVVLPTFVLEPRSFLNKLSDYYYH 
ADLLSRAA VEEDA Y S RMKLVLR WYbSGFYKKPKG I KKP YNP I LG 
ETFRCCWFHPQTDSRTFY3AE0VSHHPPVSAFHVSNRKDGPCIS 
GS 1TAKSR FYGHS LS ALLDGKATLTFLNRAEDYTLTMP YAH CKG 
I LYGTMTLELGGK VTI ECAKKNFQAQLEFKLKPFFGGSTS I NQI 
SGKITSGEEVLASI.^GHWDRDVFIKEEGSGSSALFWTPSGEVRR 
ORLRQHTV PLEEOT ELESER LWOHVTRAI S KGDQHRATQEKFAb 
EEAORORARERQKSLMPWKPOl'FHLJDPlTOEWHYRYEDHSPVJDP 
LKDIAOFEODGn-KTbCXJEAVAROlTFLGSPGPRHERSGPDORL 
RKASDOPSGHSOA'fESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRLQALKEAILS 1 REAQOBLHRHLSAWLSSTARAAOA 
PTPGLLOS PRS WF b I jCVFLA CQLFINH 1 1»K 

DFVPS FRR PSGNTc QTVWLLRAATLEKEVAGl.REKIHHLDDMLK
SQQRKVRQMI EQLQN SKAV 1 OS KDATIQELKEK J AYLEAENLEM 
HDRMEHL3EKQ1SHGNFSTQARAKTENPGSIR3SKPPSPKPMPV 
IRWET 

w a«;gm ^ tv pggs r *^ <; i i n vrgg wgvtgg es E5 LTV p VADT woa
GSFKVATOERNPOFsAOMRLRROKXGWPFLGDFLTELQRLDSAI 
PDDLDGNTN X RS K E VR VLO EMOLLQ VAAMN YRLR PLE K FYT Y FT 
RMEQLSDKESYKLSCQLEPENP 

WASGMSTVPGGSR)^SLGI0VRGGHGVTGGEEE5LTVPVAimi0A
GSFKVAT0ERNP0RAOMRLRROKKGWPFLGDFLTELORLDSAI 
PDDLDGNTNXRSKE VR VLOEMQLU>VAAMNYRLRPLEXFVTY FT 
RMEQhSDKZSYKLS CQLEPENP 

SLRNSARGLXKAfcS AARGAAALRRSINQPVAFVRR I PWTAASSQ
LKEHFAOFGHVRRCILPFPKETGFHRGLGIA^OFSSEEGLRNALO 
QENHI IDGVKVQWTRRPKLPQTSDDEKKDF 

RASSbQVLKAWGGL i PSSFOOOHTGQYALEELFDLKVYDCPCS F
NMNVSLEK0LRPSCPWPRGKCRKTPGWEEARPKAQDLRGDU5KT 








OAGPAEAHTRGPPRLPAATGCPPHIjPGLLSGISVDIDPTGLOSO 
KT P KGQDP P LMFS ED YQ K £ L LEQY K LGLDQKLRKYVVGEL 1 WNF 
ADFMTNQCG 

LDSRSCWLDMEDLEEDVRF^VDETLDFGGLSPSDSREEEDITVL

EANRLAAQLEOCALQDRESAGEGLGPRRVypSPRRETFVLKDSP 
VRDLLPTVNSLTRSTPS/LKOPDASTPE* * *EGVS0GSPGY1WK 
EALOHEEGVTHbOSVPC10KPSIFSS\SRSTPPVRGRAGPSGRA 
AASEETRAAKLRGAAJVKSSCOLPIPSAIPRPASRMPLTSRSVPP 
GRGA1.PPDSLSTRKGLPRPSTAGHRVRESGHKYPVSQRLNLPVM 

LI FLMLHMELKMLSS VTLH I RAFLYWI CLKPTSCLI FQNVLNiA
K K * S RAVGWWM CRT/YSS PLQ VG VI K P WLLLG SQDAAHDLDT 
I>K KNKVTH I LNVA YG VENAF1 i SDFTY K S I S I LDLPETN I LS YFP 
ECFEFIEEAKRKDGWLVHCNA 

SLEMSSDGEPLSRMESEDS1SSTIMDVDSTISSGRSTPAMMNGQ
GSTTSSSKN1AYNCCWD0C0ACFNSSPDLADHIRSIHVDGQRGG 
VFVCbWKGCXVYNTPSTS0SW3,0RHMLTHSGDKPFKCWGGCNA 

KLKKKRRRS^-RPHDFFDAQTLDAIRHRAICFNLSAHIESLGKG 
HSWFHSTVS1LLFF0IKYKTL.QKNISTI ISKSLKJ 


7144 


1 


988 


FRVNMQDGGP S PAEHS KAEE.S AGMEARFLGLPDAA6S SG PTPAR 
RCPAPR PAGVS YVI R DE VEK YKRNG VNALQLDPALNRLFTAGRD 
S 1 3 R I WS VNOH KQDP Y I AS M EH HTDWVND I VLCCNG KTLI S AS S 
DTT VKVWNAH KGFCM S TLRTH K D YV KAIiAYAXDKELVAS AG LDR 
03FLWDVNTLTALTASKNTVTTSSLSGNK0SIYSLAMNQLGTII 
VSGSTEKVLRVWDPRTCAKLMKLKGHTPNVKALLLNRDGTOCLS 
G SS DGT 3 R LWS LGQOR CI AT Y R VHDEGVWALQVNDAFTH VYSGG 
RDRK3YCTDLRNPD3RVL1 CI 
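
By way of illustration only, the one-letter symbols defined in the table legend can be used to check a listed amino acid segment for characters that fall outside the legend, for example digits or lower-case letters. The following Python sketch is not part of the disclosure. The names `LEGEND`, `find_unexpected` and `expand` are hypothetical helpers chosen for this example, and the sketch assumes that whitespace inside a listed segment carries no meaning.

```python
from typing import List, Tuple

# Symbols copied from the table legend. LEGEND, find_unexpected and expand
# are hypothetical names used only for this illustration.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def find_unexpected(segment: str) -> List[Tuple[int, str]]:
    """Return (position, character) pairs for characters not in the legend.

    Whitespace is skipped (assumption: it is layout only). Positions are
    1-based and count only non-whitespace characters.
    """
    unexpected = []
    position = 0
    for ch in segment:
        if ch.isspace():
            continue
        position += 1
        if ch not in LEGEND:
            unexpected.append((position, ch))
    return unexpected


def expand(segment: str) -> List[str]:
    """Spell out each symbol of a segment; raises KeyError on unknown symbols."""
    return [LEGEND[ch] for ch in segment if not ch.isspace()]


if __name__ == "__main__":
    print(find_unexpected("ETRQE"))   # every symbol is in the legend
    print(find_unexpected("SPbTG"))   # lower-case 'b' is not a legend symbol
    print(expand("ETR"))
```

A segment for which `find_unexpected` returns an empty list uses only legend symbols. A non-empty result points to positions that may warrant comparison against the original listing.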
WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO:1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO:1-1786 and 3573-5358, an active domain of SEQ ID NO:1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity,
wherein said polynucleotide has greater than about 90% sequence identity with the
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA.

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the
complementary sequences.

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 3.

9. A host cell genetically engineered to comprise the polynucleotide of claim 3
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of:
(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent
conditions with any one of SEQ ID NO:1-1786 and 3573-5358.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex;
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:
a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising, 

a) culturing a host cell comprising a polynucleotide sequence selected 
from the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1786 and 
3573-5358, a mature protein coding portion of SEQ ID NO: 1-1786 and 3573-5358, an active 
domain of SEQ ID NO: 1-1786 and 3573-5358, complementary sequences thereof and a 
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO: 1-1786 
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the 
group consisting of any one of the polypeptides SEQ ID NO: 1787-3572 and 5359-7144, 
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-1786 and 3573-5358. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer- 
readable format. 

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 